Bridging foundation models and statistical inference
One of the most potent forms of context for statistical estimation and inference is prior knowledge. Historically, applying that prior knowledge required a human expert for each new problem; today, foundation models capture entire domains in large black-box models. Importantly, context-adaptive learning systems allow us to extract this prior knowledge by connecting foundation models to structured statistical models.
Recent years have seen important advances in building interpretable models: machine learning models designed to be easily understood by humans. In this work, we show that large language models (LLMs) are remarkably good at working with interpretable models, too. In particular, we show that LLMs can describe, interpret, and debug Generalized Additive Models (GAMs). Combining the flexibility of LLMs with the breadth of statistical patterns accurately described by GAMs enables dataset summarization, question answering, and model critique. LLMs can also improve the interaction between domain experts and interpretable models and generate hypotheses about the underlying phenomenon. We release TalkToEBM as an open-source LLM-GAM interface.
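To make the LLM-GAM connection concrete, here is a minimal sketch assuming the `interpret` package for the GAM (an Explainable Boosting Machine): fit the model, serialize one term's shape function to text, and build a prompt an LLM could answer. The dictionary keys, the serialization, and the prompt format are illustrative assumptions, not the actual TalkToEBM API.

```python
# A minimal sketch of the LLM-GAM pattern, assuming the `interpret` package.
# The serialization and prompt format below are illustrative assumptions,
# not the actual TalkToEBM implementation.
from sklearn.datasets import load_breast_cancer
from interpret.glassbox import ExplainableBoostingClassifier

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
ebm = ExplainableBoostingClassifier().fit(X, y)

# Pull one term's shape function: bin edges and their additive contributions
# to the log-odds (keys assumed from interpret's global-explanation dict).
term = ebm.explain_global().data(0)
edges, scores = term["names"], term["scores"]

def shape_to_text(name, edges, scores, max_bins=10):
    """Serialize a shape function as human-readable interval -> score lines."""
    n = min(len(scores), len(edges) - 1)
    step = max(1, n // max_bins)  # downsample long shape functions
    rows = [
        f"  [{edges[i]:.3g}, {edges[i + 1]:.3g}) -> {scores[i]:+.3f}"
        for i in range(0, n, step)
    ]
    return f"Feature '{name}' (contribution to log-odds):\n" + "\n".join(rows)

prompt = (
    "Below is one shape function of a Generalized Additive Model.\n"
    + shape_to_text(ebm.term_names_[0], edges, scores)
    + "\nDescribe the pattern and flag anything surprising."
)
print(prompt)  # this text would be sent to an LLM
```

Serializing the shape function to text is what lets a text-only LLM "read" the GAM; TalkToEBM automates this kind of conversion between model graphs and natural language.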
2018
Retrofitting Distributional Embeddings to Knowledge Graphs with Functional Relations
Knowledge graphs are a versatile framework to encode richly structured data relationships, but it can be challenging to combine these graphs with unstructured data. Methods for retrofitting pre-trained entity representations to the structure of a knowledge graph typically assume that entities are embedded in a connected space and that relations imply similarity. However, useful knowledge graphs often contain diverse entities and relations (with potentially disjoint underlying corpora) which do not accord with these assumptions. To overcome these limitations, we present Functional Retrofitting, a framework that generalizes current retrofitting methods by explicitly modeling pairwise relations. Our framework can directly incorporate a variety of pairwise penalty functions previously developed for knowledge graph completion. Further, it allows users to encode, learn, and extract information about relation semantics. We present both linear and neural instantiations of the framework. Functional Retrofitting significantly outperforms existing retrofitting methods on complex knowledge graphs and loses no accuracy on simpler graphs (in which relations do imply similarity). Finally, we demonstrate the utility of the framework by predicting new drug–disease treatment pairs in a large, complex health knowledge graph.
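To illustrate the linear instantiation of the framework, here is a toy numpy sketch: each relation r is modeled as an affine map f_r(v) = A_r v + b_r, and retrofit embeddings trade off staying close to the pre-trained vectors against satisfying relation-specific penalties. The toy graph, hyperparameters, and joint gradient-descent updates are illustrative assumptions, not the paper's exact optimization procedure.

```python
# A toy sketch of functional retrofitting (linear instantiation):
# minimize  sum_i ||v_i - v_hat_i||^2
#         + lam * sum_{(i,j,r)} ||A_r v_i + b_r - v_j||^2
# over the embeddings V and the per-relation affine maps (A_r, b_r).
import numpy as np

rng = np.random.default_rng(0)
n_entities, dim = 6, 4
V_hat = rng.normal(size=(n_entities, dim))   # pre-trained embeddings
edges = [(0, 1, "treats"), (2, 3, "treats"), (1, 4, "causes")]

# One affine map per relation, learned jointly with the embeddings.
relations = {r: [np.eye(dim), np.zeros(dim)] for r in {r for _, _, r in edges}}

V = V_hat.copy()
lam, lr = 1.0, 0.05
for _ in range(500):
    grad_V = 2 * (V - V_hat)                 # pull toward original vectors
    for i, j, r in edges:
        A, b = relations[r]
        resid = A @ V[i] + b - V[j]          # relation-specific penalty term
        grad_V[i] += 2 * lam * (A.T @ resid)
        grad_V[j] -= 2 * lam * resid
        relations[r][0] -= lr * 2 * lam * np.outer(resid, V[i])
        relations[r][1] -= lr * 2 * lam * resid
    V -= lr * grad_V

# After retrofitting, f_r(V[i]) should lie close to V[j] for observed edges.
i, j, r = edges[0]
A, b = relations[r]
print("edge residual:", np.linalg.norm(A @ V[i] + b - V[j]))
```

In the paper, the penalty functions can also be nonlinear (the neural instantiation) and need not imply similarity; this sketch keeps a single shared penalty weight for brevity.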