Authors: Shouvik Mani, Michael A. Haddad, Dan Constantini, Willy Douhard, Qiwei Li, Louis Poirier

Description: A Piping and Instrumentation Diagram (P&ID) is a type of engineering diagram that uses symbols, text, and lines to represent the components and flow of an industrial process. Although used universally across industries such as manufacturing and oil & gas, P&IDs are usually trapped in image files with limited metadata, making their contents unsearchable and siloed from operational or enterprise systems. In order to extract the information contained in these diagrams, we propose a pipeline for automatically digitizing P&IDs. Our pipeline combines a series of computer vision techniques to detect symbols in a diagram, match symbols with associated text, and detect connections between symbols through lines. For the symbol detection task, we train a Convolutional Neural Network to classify certain common symbols with over 90% precision and recall. To detect connections between symbols, we use a graph search approach to traverse a diagram through its lines and discover interconnected symbols. By transforming unstructured diagrams into structured information, our pipeline enables applications such as diagram search, equipment-to-sensor mapping, and asset hierarchy creation. When integrated with operational and enterprise data, the extracted asset hierarchy serves as the foundation for a facility-wide digital twin, enabling advanced applications such as machine learning-based predictive maintenance.
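The connection-detection step can be illustrated with a minimal sketch. The description above does not specify how the diagram is represented or which search is used, so everything here is an assumption made for illustration: the diagram is taken to be a binarized mask of line pixels, detected symbols are axis-aligned bounding boxes, pixels are 4-connected, and the traversal is a breadth-first search. The names `connected_symbols`, `line_mask`, and `symbol_boxes` are hypothetical and not taken from the authors' pipeline.

```python
from collections import deque


def connected_symbols(line_mask, symbol_boxes, start_id):
    """Return ids of symbols reachable from start_id by following line pixels.

    line_mask    : 2D list of 0/1, where 1 marks a line pixel (assumed input).
    symbol_boxes : dict mapping symbol id -> (row0, col0, row1, col1), inclusive.
    """
    rows, cols = len(line_mask), len(line_mask[0])

    # Record which symbol (if any) owns each pixel.
    owner = [[None] * cols for _ in range(rows)]
    for sym_id, (r0, c0, r1, c1) in symbol_boxes.items():
        for r in range(r0, r1 + 1):
            for c in range(c0, c1 + 1):
                owner[r][c] = sym_id

    # Seed the search with every pixel of the starting symbol.
    r0, c0, r1, c1 = symbol_boxes[start_id]
    queue = deque((r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
    visited = set(queue)
    found = set()

    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols) or (nr, nc) in visited:
                continue
            visited.add((nr, nc))
            hit = owner[nr][nc]
            if hit is not None and hit != start_id:
                # Reached another symbol: record it and stop at its border.
                found.add(hit)
            elif hit is None and line_mask[nr][nc]:
                # Plain line pixel: keep following the line.
                queue.append((nr, nc))
    return found


if __name__ == "__main__":
    # Toy diagram: symbols A and B joined by a horizontal line, C isolated.
    mask = [[0] * 10 for _ in range(5)]
    for col in range(2, 6):
        mask[2][col] = 1
    boxes = {"A": (1, 0, 3, 1), "B": (1, 6, 3, 7), "C": (4, 9, 4, 9)}
    print(connected_symbols(mask, boxes, "A"))
```

In this sketch the search stops at the first symbol it meets along each line, so it reports direct neighbors only; running it from every symbol would yield an adjacency list for the whole diagram. Real drawings would also need handling for line crossings, dashed lines, and gaps, which this toy version ignores.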