LLMs Boost Scientific Research Discovery

In a paper published in the journal Information, researchers proposed using large language models (LLMs) such as the generative pre-trained transformer GPT-3.5, Llama 2, and Mistral to automatically suggest properties that go beyond traditional keywords, with the aim of making scientific research easier to find.

Research field distribution of the selected papers in our evaluation dataset containing human-domain-expert-annotated properties that were applied to represent the paper’s structured contribution descriptions in the Open Research Knowledge Graph (ORKG). Image Credit: https://www.mdpi.com/2078-2489/15/6/328

They compared manually curated properties in the Open Research Knowledge Graph (ORKG) with those generated by LLMs, assessing performance through semantic alignment, fine-grained property mapping accuracy, cosine similarity of SciNCL embeddings (SciNCL is a scientific document representation model trained with neighborhood contrastive learning), and expert surveys within a multidisciplinary science context. While LLMs showed promise in structuring science, the researchers recommended further refinement to better align them with scientific tasks and human expertise.


Past work has extensively investigated the use of LLMs in scientific literature analysis, covering tasks such as summarization, insight extraction, and literature reviews. However, applying LLMs specifically to recommend research dimensions is relatively new. Recent advancements include domain-specific language models such as SciBERT (a bidirectional encoder representations from transformers model pretrained on scientific text), SPECTER (scientific paper embeddings using citation-informed transformers), and SciNCL.

Evaluations comparing LLM-generated dimensions with manually curated properties have employed similarity measures such as cosine and Jaccard similarity. Additionally, LLMs have been utilized as evaluators, showcasing their potential in assessing generated content quality.
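As a rough illustration of the two similarity measures named above, the following minimal sketch compares two lists of labels at the word-token level. The tokenization rule and the example labels are assumptions made for illustration and are not taken from the study or its dataset.

```python
import math
import re
from collections import Counter


def tokenize(labels):
    """Lowercase the labels and split them into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", " ".join(labels).lower())


def jaccard_similarity(labels_a, labels_b):
    """Size of the token-set intersection divided by the size of the union."""
    set_a, set_b = set(tokenize(labels_a)), set(tokenize(labels_b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def cosine_similarity(labels_a, labels_b):
    """Cosine of the angle between the two token-count vectors."""
    counts_a, counts_b = Counter(tokenize(labels_a)), Counter(tokenize(labels_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical example labels, not taken from the study's dataset.
orkg_properties = ["research problem", "method", "dataset"]
llm_dimensions = ["method", "evaluation dataset", "model size"]

print(jaccard_similarity(orkg_properties, llm_dimensions))
print(cosine_similarity(orkg_properties, llm_dimensions))
```

Jaccard similarity only asks whether a token occurs in both lists, while cosine similarity also weighs how often it occurs. Neither captures synonyms, which is one reason embedding-based comparisons are used alongside them.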

Evaluation and Analysis Framework

This section outlines the approach in three parts. First, it describes how the gold-standard evaluation dataset was created from the ORKG: research comparison properties annotated by human domain experts serve as the reference against which the LLM-generated properties are compared.

Second, it gives an overview of the three LLMs, namely GPT-3.5, Llama 2, and Mistral, that were applied to automatically generate research comparison properties, highlighting their respective technical characteristics. Lastly, it discusses the evaluation methods used in the study, which offer different perspectives on how similar the ORKG properties of the gold-standard instances are to those generated by the LLMs.

The researchers detailed the process of curating the evaluation dataset from ORKG comparisons. This dataset comprises structured papers from diverse research fields, each accompanied by human-annotated properties. These properties reflect nuanced aspects of research contributions across various domains, which is essential for comparative analysis with LLM-generated dimensions. The researchers also explained the distinction between ORKG properties and research dimensions, emphasizing that the latter provide broader context for analyzing research problems.

The researchers discussed the selection and characterization of the three LLMs (GPT-3.5, Llama 2, and Mistral) used to generate research dimensions. Each model was assessed based on its parameters, accessibility, and performance to show its suitability for the evaluation tasks. They also outlined how prompts were designed and tailored to each LLM so that each model performed as well as possible at generating research dimensions.
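To make the idea of prompting a model for research dimensions more concrete, here is a minimal sketch of a prompt builder and a response parser. The prompt wording, the choice of a research problem as input, and the function names `build_prompt` and `parse_dimensions` are all hypothetical; the study's actual prompts are not reproduced here, and no model is called.

```python
import ast

# Hypothetical prompt wording; this is not the prompt used in the study.
PROMPT_TEMPLATE = (
    "You are assisting with structuring scientific knowledge.\n"
    "Research problem: {research_problem}\n"
    "List the research dimensions that would be useful for comparing papers "
    "addressing this problem. Answer only with a Python list of strings."
)


def build_prompt(research_problem):
    """Fill the (assumed) template with a research problem statement."""
    return PROMPT_TEMPLATE.format(research_problem=research_problem)


def parse_dimensions(response_text):
    """Parse the text between the first '[' and the last ']' as a list literal."""
    start = response_text.find("[")
    end = response_text.rfind("]")
    if start == -1 or end <= start:
        return []
    try:
        parsed = ast.literal_eval(response_text[start:end + 1])
    except (ValueError, SyntaxError):
        return []
    if not isinstance(parsed, list):
        return []
    return [str(item).strip() for item in parsed]


# The response string below is made up to show the parsing step.
example_response = 'Here are the dimensions: ["method", "dataset ", "evaluation metric"]'
print(build_prompt("automatic text summarization"))
print(parse_dimensions(example_response))
```

Asking for a machine-readable list keeps the parsing step simple, and returning an empty list on malformed output lets a caller decide whether to retry or skip the paper.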

The researchers outlined the approach to evaluating the similarity between ORKG properties and LLM-generated research dimensions. The evaluation techniques include semantic alignment and deviation assessments using GPT-3.5, property-dimension mappings, and embedding-based semantic distance evaluations. They also described a human assessment survey conducted to gauge the utility of LLM-generated dimensions compared to domain-expert-annotated ORKG properties.
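The property-dimension mapping idea can be sketched as follows: each ORKG property is paired with its closest LLM-generated dimension, and the pair counts as a mapping only if the similarity reaches a threshold. This sketch uses character-level string similarity from Python's standard library as a stand-in for a semantic judgment; it is not the study's procedure, and the threshold value and function names are assumptions.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Character-level similarity ratio in [0, 1]; a stand-in for a semantic judgment."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def map_properties(properties, dimensions, threshold=0.6):
    """Map each property to its most similar dimension if the score reaches the threshold."""
    mappings = {}
    for prop in properties:
        best_dim, best_score = None, 0.0
        for dim in dimensions:
            score = string_similarity(prop, dim)
            if score > best_score:
                best_dim, best_score = dim, score
        if best_dim is not None and best_score >= threshold:
            mappings[prop] = best_dim
    return mappings


# Hypothetical labels for illustration only.
mapped = map_properties(["method", "research problem"], ["methods", "model size"])
print(mapped)
print(len(mapped), "of 2 properties mapped")
```

Counting the mapped properties per paper gives a fine-grained view that a single similarity score hides, since it shows which individual properties have no counterpart among the generated dimensions.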

LLM Performance Evaluation

The evaluation section assesses the LLMs by comparing the research dimensions they recommend with ORKG properties. It employs several similarity assessments, including semantic alignment and deviation evaluations, property mappings, and embedding-based analyses.

Results indicate a moderate alignment between paper properties and research dimensions, with LLM-generated dimensions showing diversity but lower similarity to ORKG properties. This highlights the challenge of replicating expert annotation with LLMs and points to domain-specific fine-tuning as one avenue for improving alignment.

In-depth analysis reveals a discrepancy in mappings between paper properties and research dimensions, emphasizing the varied scopes of ORKG properties and research dimensions. While LLMs offer diversity in dimension generation, their alignment with expert-annotated ORKG properties remains a hurdle.

However, embedding-based evaluations demonstrate a high semantic similarity between LLM-generated dimensions and ORKG properties, particularly for GPT-3.5. This underscores the potential of LLMs in automating research metadata creation, albeit with room for further refinement.

Overall, the evaluation shows that LLMs can generate research dimensions that are semantically related to expert-annotated ORKG properties, albeit with some challenges. Despite the need for improvement in specificity and alignment with research goals, LLMs offer valuable support in structuring research contributions and comparisons. The study highlights the promising role of artificial intelligence (AI) tools like LLMs in enhancing knowledge organization within platforms such as the ORKG, paving the way for more efficient and effective research dissemination and discovery.


In summary, the study investigated the efficacy of LLMs in recommending research dimensions, focusing on their alignment with manually curated ORKG properties and their potential to automate research metadata creation. A moderate alignment was found between LLM-generated dimensions and expert-curated properties, alongside challenges in replicating the nuanced expertise of domain experts.

While LLMs offered diversity in dimension generation, their alignment with expert-curated properties remained a hurdle. Future research should explore fine-tuning LLMs on scientific domains to enhance their performance in recommending research dimensions, advancing their potential in automating research metadata creation, and improving knowledge organization.

Journal reference:
https://www.mdpi.com/2078-2489/15/6/328

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.


Please use the following format to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2024, June 13). LLMs Boosts Scientific Research Discovery. AZoAi. Retrieved on July 17, 2024 from https://www.azoai.com/news/20240613/LLMs-Boosts-Scientific-Research-Discovery.aspx.

