Leveraging natural language processing to curate the tmCAT, tmPHOTO, tmBIO, and tmSCO datasets of functional transition metal complexes
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Department of Chemistry, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Leveraging natural language processing models including transformers, we curate four distinct datasets: tmCAT for catalysis, tmPHOTO for photophysical activity, tmBIO for biological relevance, and tmSCO for magnetism.
- Sponsoring Organization:
- USDOE
- Grant/Contract Number:
- SC0016214
- OSTI ID:
- 2447510
- Journal Information:
- Faraday Discussions, Vol. 256; ISSN 1359-6640; ISSN FDISE6
- Publisher:
- Royal Society of Chemistry (RSC)
- Country of Publication:
- United Kingdom
- Language:
- English