Operating systems provide accessibility APIs that are made up of building blocks. These building blocks carry three kinds of information about a UI element.
The first is a role: What kind of element is this?
The second is a name: a label or an identifier for this element.
The third is state and other properties: other functional aspects of an element that a user or an assistive technology needs to be aware of, for example whether a checkbox is checked or unchecked.
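To make the three kinds of information concrete, here is a minimal sketch of one building block as a data structure. The names `AccessibleElement` and `describe` are hypothetical and chosen for illustration; they are not part of any real OS accessibility API, and the announcement format is only an assumption about what an assistive technology might say.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class AccessibleElement:
    """Hypothetical model of one building block; not a real OS API type."""
    role: str   # what kind of element this is, e.g. "checkbox"
    name: str   # label or identifier, e.g. "Subscribe to newsletter"
    # state and other properties, e.g. {"checked": True}
    states: Dict[str, bool] = field(default_factory=dict)


def describe(element: AccessibleElement) -> str:
    """Build the kind of announcement an assistive technology might make."""
    # Only states that are currently true are announced in this sketch.
    active = [state for state, is_on in element.states.items() if is_on]
    return ", ".join([element.name, element.role] + active)


checkbox = AccessibleElement(
    role="checkbox",
    name="Subscribe to newsletter",
    states={"checked": True, "disabled": False},
)
print(describe(checkbox))  # Subscribe to newsletter, checkbox, checked
```

Keeping role, name, and state as separate fields is what lets an assistive technology present the same element in different ways (speech, braille, voice control) without knowing how it is drawn.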
It looks a lot like ARIA. Oh wait!
Web browsers now expose this same information as an accessibility tree.
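The resemblance to ARIA can be sketched by reading role, name, and state straight from explicit ARIA attributes in markup. This is a toy illustration only: the class `AriaExtractor` is hypothetical, it looks only at the `role`, `aria-label`, and other `aria-*` attributes, and it ignores the implicit roles of native elements and the fuller name computation that real browsers perform when building the accessibility tree.

```python
from html.parser import HTMLParser


class AriaExtractor(HTMLParser):
    """Hypothetical helper: collects (role, name, states) from explicit ARIA attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "role" not in attributes:
            return  # simplification: elements without an explicit role are skipped
        # Every aria-* attribute other than the label is treated as state here.
        states = {
            key[len("aria-"):]: value
            for key, value in attributes.items()
            if key.startswith("aria-") and key != "aria-label"
        }
        self.elements.append(
            (attributes["role"], attributes.get("aria-label", ""), states)
        )


parser = AriaExtractor()
parser.feed('<div role="checkbox" aria-label="Subscribe" aria-checked="true"></div>')
print(parser.elements)  # [('checkbox', 'Subscribe', {'checked': 'true'})]
```

The same role, name, and state triple falls out of the markup, which is the point of the comparison.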