Identify pixels enclosed in the bounding rectangle, obtain convex hull, by greedily joining neighbouring points in the hull so as to minimize added area until quadrilateral is obtained.
Refine quadrilateral to sub-pixel accuracy by matching the sigil image. This should involve inverting a matrix at some stage, if you are doing it right.
A webcam spews out a high amount of information, hopefully this can be translated into very accurate positioning.
Monkey carves wood, guided by computer.
Webcams are cheap. Computers are cheap. Monkeys are cheap.