Name | Description |
---|---|
ChatGPT | Conversational AI driven by text prompts |
T5 | Prompt-based text synthesis |
SciSpace | Automatic literature review and data extraction from PDFs |
Consensus AI | Literature search, synthesis, and Q&A |
Elicit | Literature search, synthesis, and Q&A |
Falcon | LLM whose pretraining corpus includes extensive scientific literature |
LLM4SD | LLM for scientific discovery in physiology, biophysics, physical chemistry, and quantum mechanics |
ChatPDF | Chatbot with PDFs |
Claude | Chatbot with PDFs |
Perplexity.ai | Chatbot that grounds answers by retrieving from web sources (e.g., Wikipedia) and your own PDFs |
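Several of the tools above (ChatPDF, Claude with PDFs, Perplexity with your own files) follow a common retrieval-augmented pattern: split the document into chunks, find the chunk most relevant to the question, and hand it to the LLM as context. The sketch below illustrates that pattern with a toy word-overlap scorer; it is not any vendor's actual implementation, and the sample text and function names are invented for illustration.

```python
# Illustrative sketch of "chat with your PDF" retrieval:
# chunk the document, score chunks against the question,
# and return the best chunk to use as LLM context.
import re
from collections import Counter


def chunk_text(text, size=40):
    """Split text into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def best_chunk(question, chunks):
    """Return the chunk with the highest word-overlap score (a stand-in
    for the embedding similarity real tools use)."""
    q = Counter(tokenize(question))

    def score(chunk):
        c = Counter(tokenize(chunk))
        return sum(min(q[w], c[w]) for w in q)

    return max(chunks, key=score)


# Toy "paper" text, invented for this example.
paper = (
    "We introduce a transformer model for protein folding. "
    "The model is trained on 48 million scientific papers. "
    "Evaluation shows strong results on biophysics benchmarks."
)
chunks = chunk_text(paper, size=8)
context = best_chunk("What data was the model trained on?", chunks)
print(context)  # prints the chunk describing the training data
```

In a real pipeline the overlap score would be replaced by dense embeddings (as in retrievers like SciDPR, listed below), and the selected chunk would be prepended to the user's question in the LLM prompt.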
LLMs trained on scientific publications (updated regularly)
Name | Description |
---|---|
BLOOM | BLOOM (BigScience Large Open-science Open-access Multilingual Language Model): a 176-billion-parameter multilingual model from the BigScience workshop. |
SciBERT | A BERT model pretrained on scientific text, using papers from the semanticscholar.org corpus. |
GALACTICA by Meta | LLM trained on over 48 million papers, textbooks, reference material, compounds, proteins, and other sources of scientific knowledge. It was taken offline shortly after release because it generated misinformation. |
IBM-NASA Models | Trained on 60 billion tokens from a corpus of astrophysics, planetary science, earth science, heliophysics, and biological and physical sciences data. |
Mozi | The first large language model specialized for the scientific-paper domain, supporting tasks such as question answering and emotional support. Backed by a large language model and the SciDPR evidence-retrieval model, Mozi generates concise, accurate answers to questions about specific papers and offers emotional support to academic researchers. |
Awesome Scientific Language Models | A curated list of pre-trained language models in scientific domains (e.g., mathematics, physics, chemistry, materials science, biology, medicine, geoscience). |
Selected readings