Pattern-based Approaches to NLP in the Age of Deep Learning (Pan-DL)

The submission deadline has been extended to 09/05/2023.

For more information, please visit the call for papers. Papers may be submitted until the end of the deadline day (anywhere on Earth; UTC-12) via our Softconf submission site.


Complementing deep learning with symbolic and pattern-based approaches: when, why and how?

Deep-learning-based approaches dominate NLP research and routinely produce strong results on benchmarks. However, these approaches rely on the availability of large amounts of high-quality annotated data. Furthermore, the learned models are difficult to interpret and incur substantial technical debt [1]: they are hard to modify or adapt to new situations without further data collection and retraining, a process that requires machine learning expertise and access to resources. As a direct result, the approaches and models are largely inaccessible to users who do not meet these requirements. This is critical, as such users make up a significant portion of society and include, in particular, the many subject-matter experts who have deep understanding of and insight into the use cases of these systems, but who are excluded from the typical deep-learning-based NLP paradigm.

On the other hand, although methods based on matching symbolic patterns are less accurate on standard test splits than deep-learning-based systems, they offer significant practical advantages: they are easier to deploy and adapt, particularly when data is limited; they support human examination of intermediate representations and reasoning steps; they are more transparent to subject-matter experts; they are amenable to having a human in the loop through intervention, manipulation and incorporation of domain knowledge; and the resulting systems tend to be lightweight and fast. These attributes make rule-based approaches highly valued in industry [2].

Indeed, as evidenced by an impromptu meeting on this topic at ACL 2020, there is considerable interest in pattern-based methods, both in industry and in academic circles focusing on concrete applications (bio-NLP, medical NLP, law, etc.). The meeting, which grew out of questions raised in the birds-of-a-feather session on information extraction, had about 35 participants and sparked a lively discussion that spanned many areas of NLP. One general theme in that discussion was that pattern-based methods are often preferable to learning-based solutions in practice, but that this is not sufficiently recognized in the NLP research community, partly because it is increasingly hard to publish on such approaches, or on findings in which pattern-based approaches work well. We would like to shed light on these use cases. As a consequence of the lack of research attention, systems often still use the same approaches that were developed 50 years ago. How can we bring pattern-based approaches into the era of modern NLP? And how can we combine learning-based and pattern-based approaches?

This workshop will focus on all aspects of pattern-based approaches, including their application, representation, and interpretability, as well as their strengths and weaknesses relative to state-of-the-art machine learning approaches. It will also explore ways of combining the strengths of pattern-based, deep learning and other statistical methods.

We hope the workshop will be the first step in building a community of researchers from different areas of NLP, both applied and theoretical, who are interested in pattern-based approaches and who use them in their work (e.g., industry practitioners and domain experts). To that end, beyond the standard keynote talks and contributed presentations, we plan for the workshop to include a panel discussion, breakout sessions on different sub-topics, and a meeting to discuss a potential shared task for future events.

For more information on topics of interest, please see our call for papers.

  1. Machine Learning: The High Interest Credit Card of Technical Debt
    Sculley, D., Holt, G., Golovin, D., Davydov, E., Phillips, T., Ebner, D., Chaudhary, V. and Young, M., 2014. NIPS 2014 Workshop.
  2. Rule-Based Information Extraction is Dead! Long Live Rule-Based Information Extraction Systems!
    Chiticariu, L., Li, Y. and Reiss, F.R., 2013. EMNLP, pp. 827–832. Association for Computational Linguistics.

© 2023 by Pan-DL. All rights reserved.