Online photo libraries face the problem of organizing their rapidly growing image collections. Fast and reliable image retrieval requires high-quality captions attached to each photo; however, photographers consider writing them a time-consuming and annoying task. A fully automated process for augmenting a photo with captions or labels must start by identifying the objects that the photo depicts. Previous attempts at full automation that relied on computer vision technology alone proved suboptimal due to calibration issues. Existing photo annotation tools based on GPS or geo-tagging services can only use generic location information to add textual descriptions of the context and surroundings of the photo, not of what the photo actually shows. To describe exactly what is captured in a digital photo, the view orientation is required in order to determine the extent of the captured scene and to identify the features from existing spatial datasets that lie within that extent. Assuming that camera devices with integrated GPS and digital compass will become available in the near future, our research introduces an approach to identify and localize the objects captured in a digital photo using this full spatial metadata. It proposes the use of GIS technology and conventional spatial datasets to place a label next to a pictured object at its best possible location.
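
The role of the view orientation can be illustrated with a minimal sketch. It is not the method developed in this work, only a simplified instance of the idea: given a camera position, a compass heading, and a horizontal field of view, it selects the point features that fall inside the resulting view sector and estimates where each would appear across the image width. The function names, the `max_dist_m` cutoff, the flat-earth distance approximation, and the linear mapping from angle to image position are all assumptions made for illustration; occlusion and terrain are ignored.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def bearing_and_distance(cam_lat, cam_lon, lat, lon):
    """Approximate bearing (degrees clockwise from north) and distance (m)
    from the camera to a point, using a local flat-earth approximation
    (an assumption that is only reasonable over short distances)."""
    d_north = math.radians(lat - cam_lat) * EARTH_RADIUS_M
    d_east = (math.radians(lon - cam_lon) * EARTH_RADIUS_M
              * math.cos(math.radians(cam_lat)))
    bearing = math.degrees(math.atan2(d_east, d_north)) % 360.0
    return bearing, math.hypot(d_east, d_north)


def visible_features(cam_lat, cam_lon, heading_deg, fov_deg, max_dist_m, features):
    """Return (name, x_fraction, distance_m) for features inside the view sector.

    features: iterable of (name, lat, lon) tuples (hypothetical input format).
    x_fraction: 0.0 is the left image edge, 1.0 the right edge; the mapping is
    linear in angle, which is a simplification of a real camera projection.
    """
    hits = []
    for name, lat, lon in features:
        bearing, dist = bearing_and_distance(cam_lat, cam_lon, lat, lon)
        # Signed angular offset from the view direction, in [-180, 180).
        offset = (bearing - heading_deg + 180.0) % 360.0 - 180.0
        if dist <= max_dist_m and abs(offset) <= fov_deg / 2.0:
            hits.append((name, 0.5 + offset / fov_deg, dist))
    # Nearest features first, as a simple ordering for label placement.
    return sorted(hits, key=lambda hit: hit[2])
```

With a camera facing east (heading 90 degrees) and a 60 degree field of view, a feature due east of the camera and within range is returned with a horizontal position at the image centre, while features to the north or beyond `max_dist_m` are discarded. A GPS position alone cannot make this distinction, which is why the heading is needed.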