Tag: SAS Visual Text Analytics

Classifying messy documents: A common-sense approach (Part II)

by Daria Rostovtseva • August 16, 2021

In Part I of this blog post, I provided an overview of the approach my team and I took to classifying diverse, messy documents at scale. I shared the details of how we chose to preprocess the data and how we created features from documents of interest […]

Classifying messy documents: A common-sense approach (Part II) was published on SAS Users.

Classifying messy documents: A common-sense approach (Part I)

by Daria Rostovtseva • August 4, 2021

Unstructured text data is ubiquitous in both business and government, and extracting value from it at scale is a common challenge. Organizations that have been around for a while often have vast paper archives. Digitizing these archives does not necessarily make them usable for search and analysis, since documents are […]

Classifying messy documents: A common-sense approach (Part I) was published on SAS Users.