I am setting up an application that uses the Google Cloud Vision API, and so far every bounding box it returns covers the entire image. For example, if the image is 100x100, the bounding box is [(0,0), (100, 0), (100, 100), (0, 100)]. Even when multiple landmarks are detected (and they are well separated in the image), each one is bounded by the whole image.
I even used the example image gs://cloud-samples-data/vision/landmark/st_basils.jpeg, and according to the documentation the output should be
{
  "responses": [
    {
      "landmarkAnnotations": [
        {
          "mid": "/m/014lft",
          "description": "Saint Basil's Cathedral",
          "score": 0.7840959,
          "boundingPoly": {
            "vertices": [
              {
                "x": 812,
                "y": 1058
              },
              {
                "x": 2389,
                "y": 1058
              },
              {
                "x": 2389,
                "y": 3052
              },
              {
                "x": 812,
                "y": 3052
              }
            ]
          },
          "locations": [
            {
              "latLng": {
                "latitude": 55.752912,
                "longitude": 37.622315883636475
              }
            }
          ]
        }
      ]
    }
  ]
}
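For reference, a landmark detection request for that image in REST form looks roughly like this (a minimal sketch of the images:annotate request body; authentication and project details omitted):
{
  "requests": [
    {
      "image": {
        "source": {
          "imageUri": "gs://cloud-samples-data/vision/landmark/st_basils.jpeg"
        }
      },
      "features": [
        {
          "type": "LANDMARK_DETECTION"
        }
      ]
    }
  ]
}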
But when I actually submit the request, I get
{
  "responses": [
    {
      "landmarkAnnotations": [
        {
          "mid": "/m/0hm_7",
          "description": "Red Square",
          "score": 0.7341708,
          "boundingPoly": {
            "vertices": [
              {},
              {
                "x": 2487
              },
              {
                "x": 2487,
                "y": 3213
              },
              {
                "y": 3213
              }
            ]
          },
          "locations": [
            {
              "latLng": {
                "latitude": 55.753930299999993,
                "longitude": 37.620794999999994
              }
            }
          ]
        }
      ]
    }
  ]
}
which is the entire image again.
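In case it helps to reproduce this, a minimal sketch of the same call through the Python client library (assuming google-cloud-vision v2+ and application-default credentials; not my exact application code) is:
# Minimal landmark detection sketch with the google-cloud-vision client.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Same sample image as above; a local file or another gs:// URI works too.
image = vision.Image(
    source=vision.ImageSource(
        image_uri="gs://cloud-samples-data/vision/landmark/st_basils.jpeg"
    )
)

response = client.landmark_detection(image=image)

# Print each detected landmark with its bounding box vertices.
for landmark in response.landmark_annotations:
    vertices = [(v.x, v.y) for v in landmark.bounding_poly.vertices]
    print(landmark.description, landmark.score, vertices)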
I then watched this tutorial, and when the author ran the code live I could see good results, but when I tried the same image, nothing was detected.
Has there been some significant change to the Cloud Vision API? It seems to perform much worse than it used to. Is there any setting, or anything else I can do, to improve the bounding boxes?