{"id":70171,"date":"2023-11-07T21:01:08","date_gmt":"2023-11-07T13:01:08","guid":{"rendered":"https:\/\/www.hongkiat.com\/blog\/?p=70171"},"modified":"2023-11-04T17:18:27","modified_gmt":"2023-11-04T09:18:27","slug":"chatgpt-vision","status":"publish","type":"post","link":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/","title":{"rendered":"ChatGPT Vision: What It Can and Cannot Do Currently"},"content":{"rendered":"<p>The OpenAI team has been hard at work. They\u2019ve not only <a href=\"https:\/\/www.hongkiat.com\/blog\/dall-e-3-chatgpt\/\">integrated DALL\u00b7E into ChatGPT<\/a>, but they\u2019ve also added a new Vision feature to it.<\/p>\n<figure><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg\" alt=\"ChatGPT Vision feature\" width=\"1600\" height=\"900\"><\/figure>\n<p>Vision enables interaction with ChatGPT through images and photos. You can upload a photo from your phone, or via a browser if you\u2019re using the desktop version, or you can take a new picture and upload it. After selecting the photo, click \u2018Confirm,\u2019 and then provide the question or instruction to ChatGPT.<\/p>\n<p>ChatGPT will use your image as a reference, and you can ask it all sorts of things. I\u2019ve tested it extensively, pushing it to its limits to discover its capabilities and limitations with vision. To find out more about what vision can do and assess its accuracy, continue reading.<\/p>\n<hr>\n<h3>\u2705 Recognizing Objects with Limited Info<\/h3>\n<p>First, I snapped a photo of a mobile game to see if ChatGPT could figure out what it was.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>While it didn\u2019t give the exact name of the game \u2013 since it wasn\u2019t visible in the picture \u2013 it did correctly identify it as a Monopoly-like mobile game. 
To me, that\u2019s a pretty accurate guess for an AI.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/object-limited-info-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/object-limited-info-prompt.jpg\" alt=\"Mobile game resembling Monopoly\" width=\"1500\" height=\"1136\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/object-limited-info-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/object-limited-info-output.jpg\" alt=\"AI identified Monopoly-like game\" width=\"1500\" height=\"241\"><\/span><\/figure>\n<hr>\n<h3>\u2705 Extracting Text from an Image<\/h3>\n<p>Then, I snapped a photo of an article on hongkiat.com to see if ChatGPT could read the text within the image.<\/p>\n<p><strong>Result:<\/strong><\/p>\n<p>It managed to read and reproduce the website\u2019s name, article title, and body text flawlessly.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/extract-text-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/extract-text-prompt.jpg\" alt=\"Article photo for text extraction\" width=\"1500\" height=\"1139\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/extract-text-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img 
loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/extract-text-output.jpg\" alt=\"Extracted text from article\" width=\"1500\" height=\"1200\"><\/span><\/figure>\n<hr>\n<h3>\u2705 Extracting <em>Selected<\/em> Text from an Image<\/h3>\n<p>I also tested if ChatGPT could read just a part of an image by circling the text I was interested in.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>It successfully followed the instruction and output the required text just as well.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/extract-selected-text-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/extract-selected-text-prompt.jpg\" alt=\"Circled text for selective extraction\" width=\"1500\" height=\"1061\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/extract-selected-text-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/extract-selected-text-output.jpg\" alt=\"AI extracted circled text\" width=\"1500\" height=\"342\"><\/span><\/figure>\n<hr>\n<h3>\u2705 Interpreting a Real-World Photo<\/h3>\n<p>Later, I took a photo of a restaurant menu that included text and pictures and asked ChatGPT to itemize all the dishes along with their prices.<\/p>\n<p><strong>Result:<\/strong><\/p>\n<p>It did this perfectly.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/read-from-menu-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" 
decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/read-from-menu-prompt.jpg\" alt=\"Restaurant menu photo\" width=\"1500\" height=\"899\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/read-from-menu-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/read-from-menu-output.jpg\" alt=\"Listed dishes with prices\" width=\"1500\" height=\"785\"><\/span><\/figure>\n<hr>\n<h3>\u2705 Analyzing Data from a Real-World Photo<\/h3>\n<p>I gave it another menu and this time asked for the total cost of certain items.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>It calculated the total correctly.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/calculate-from-menu-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/calculate-from-menu-prompt.jpg\" alt=\"Menu photo for cost calculation\" width=\"1500\" height=\"759\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/calculate-from-menu-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/calculate-from-menu-output.jpg\" alt=\"Calculated total cost\" width=\"1500\" height=\"466\"><\/span><\/figure>\n<hr>\n<h3>\u2705 More Complex Analysis of a Real-World Photo<\/h3>\n<p>To further test the vision feature, I took a picture of a bookshelf to see if it could estimate the number of books in the 
column.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>It counted 42 book spines, which is close enough, considering I estimate the actual number to be between 40 and 50.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/complex-analysis-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/complex-analysis-prompt.jpg\" alt=\"Bookshelf photo\" width=\"1500\" height=\"694\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/complex-analysis-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/complex-analysis-output.jpg\" alt=\"Estimated book count\" width=\"1500\" height=\"410\"><\/span><\/figure>\n<hr>\n<h3>\u2705 Creating Content from a Product Photo<\/h3>\n<p>Then I snapped a photo of a mug to see if it could recognize the object and generate some content for it.<\/p>\n<p><strong>Results:<\/strong><\/p>\n<p>The output it gave was pretty good!<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/content-from-product-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/content-from-product-prompt.jpg\" alt=\"Mug photo\" width=\"1500\" height=\"683\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/content-from-product-output.jpg\" data-mfp-type=\"image\" 
data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/content-from-product-output.jpg\" alt=\"Generated content for mug\" width=\"1500\" height=\"508\"><\/span><\/figure>\n<hr>\n<h3>\u274e Retrieving EXIF Info from a Photo<\/h3>\n<p>However, there were tasks ChatGPT\u2019s Vision couldn\u2019t handle. For instance, it was unable to extract the EXIF data from the uploaded image.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/get-exif-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/get-exif-prompt.jpg\" alt=\"Photo for EXIF data\" width=\"1500\" height=\"1041\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/get-exif-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/get-exif-output.jpg\" alt=\"Failed EXIF data retrieval\" width=\"1500\" height=\"481\"><\/span><\/figure>\n<hr>\n<h3>\u274e Recognizing Objects in a Photo<\/h3>\n<p>It also can\u2019t use internet browsing to acquire information it doesn\u2019t know. 
For example, when I showed it a picture of a Pok\u00e9mon and asked for its name, it guessed incorrectly, likely because it can\u2019t reference the internet.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/identifying-object-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/identifying-object-prompt.jpg\" alt=\"Pok\u00e9mon photo\" width=\"1500\" height=\"1135\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/identifying-object-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/identifying-object-output.jpg\" alt=\"Incorrect Pok\u00e9mon identification\" width=\"1500\" height=\"222\"><\/span><\/figure>\n<hr>\n<h3>\u274e Recognizing Languages in a Photo<\/h3>\n<p>It struggled with foreign languages too. 
I showed it Chinese text, and it didn\u2019t recognize the characters or their meaning.<\/p>\n<p><strong>Prompt:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/identify-language-prompt.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/identify-language-prompt.jpg\" alt=\"Chinese text photo\" width=\"1500\" height=\"1125\"><\/span><\/figure>\n<p><strong>Output:<\/strong><\/p>\n<figure><span class=\"su-lightbox\" data-mfp-src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/original\/identify-language-output.jpg\" data-mfp-type=\"image\" data-mobile=\"yes\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/preview\/identify-language-output.jpg\" alt=\"Failed Chinese text recognition\" width=\"1500\" height=\"290\"><\/span><\/figure>\n<hr>\n<p>So, those were my tests of ChatGPT\u2019s vision feature. Overall, it\u2019s quite a useful tool that can be employed creatively. 
It\u2019s also worth mentioning that, at the time of writing this article, ChatGPT\u2019s Vision is only available on desktop browser versions and the iOS app.<\/p>","protected":false},"excerpt":{"rendered":"<p>ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[3398],"tags":[3545],"topic":[],"class_list":["entry-content","is-maxi"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v22.8 (Yoast SEO v27.6) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>ChatGPT Vision: What It Can and Cannot Do Currently - Hongkiat<\/title>\n<meta name=\"description\" content=\"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"ChatGPT Vision: What It Can and Cannot Do Currently\" \/>\n<meta property=\"og:description\" content=\"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/\" \/>\n<meta property=\"og:site_name\" content=\"Hongkiat\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/hongkiatcom\" \/>\n<meta 
property=\"article:published_time\" content=\"2023-11-07T13:01:08+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg\" \/>\n<meta name=\"author\" content=\"Hongkiat Lim\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@hongkiat\" \/>\n<meta name=\"twitter:site\" content=\"@hongkiat\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Hongkiat Lim\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"8 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/\"},\"author\":{\"name\":\"Hongkiat Lim\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#\\\/schema\\\/person\\\/e3613a3bf757e4f67770f0b7a339edd0\"},\"headline\":\"ChatGPT Vision: What It Can and Cannot Do Currently\",\"datePublished\":\"2023-11-07T13:01:08+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/\"},\"wordCount\":950,\"publisher\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/assets.hongkiat.com\\\/uploads\\\/chatgpt-vision\\\/hero.jpg\",\"keywords\":[\"Artificial Intelligence\"],\"articleSection\":[\"Internet\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/\",\"url\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/\",\"name\":\"ChatGPT Vision: What It Can and Cannot Do Currently - 
Hongkiat\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/assets.hongkiat.com\\\/uploads\\\/chatgpt-vision\\\/hero.jpg\",\"datePublished\":\"2023-11-07T13:01:08+00:00\",\"description\":\"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#primaryimage\",\"url\":\"https:\\\/\\\/assets.hongkiat.com\\\/uploads\\\/chatgpt-vision\\\/hero.jpg\",\"contentUrl\":\"https:\\\/\\\/assets.hongkiat.com\\\/uploads\\\/chatgpt-vision\\\/hero.jpg\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/chatgpt-vision\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"ChatGPT Vision: What It Can and Cannot Do Currently\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/\",\"name\":\"Hongkiat\",\"description\":\"Tech and Design 
Tips\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#organization\",\"name\":\"Hongkiat.com\",\"url\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/wp-content\\\/uploads\\\/hkdc-logo-rect-yoast.jpg\",\"contentUrl\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/wp-content\\\/uploads\\\/hkdc-logo-rect-yoast.jpg\",\"width\":1200,\"height\":799,\"caption\":\"Hongkiat.com\"},\"image\":{\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/hongkiatcom\",\"https:\\\/\\\/x.com\\\/hongkiat\",\"https:\\\/\\\/www.pinterest.com\\\/hongkiat\\\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/#\\\/schema\\\/person\\\/e3613a3bf757e4f67770f0b7a339edd0\",\"name\":\"Hongkiat Lim\",\"description\":\"Founder and Editor in Chief of Hongkiat.com. Hongkiat is also a designer, developer, entrepreneur, and an active investor in the US stock market.\",\"sameAs\":[\"http:\\\/\\\/www.hongkiat.com\\\/blog\"],\"url\":\"https:\\\/\\\/www.hongkiat.com\\\/blog\\\/author\\\/hongkiat\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"ChatGPT Vision: What It Can and Cannot Do Currently - Hongkiat","description":"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/","og_locale":"en_US","og_type":"article","og_title":"ChatGPT Vision: What It Can and Cannot Do Currently","og_description":"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for you.","og_url":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/","og_site_name":"Hongkiat","article_publisher":"https:\/\/www.facebook.com\/hongkiatcom","article_published_time":"2023-11-07T13:01:08+00:00","og_image":[{"url":"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg","type":"","width":"","height":""}],"author":"Hongkiat Lim","twitter_card":"summary_large_image","twitter_creator":"@hongkiat","twitter_site":"@hongkiat","twitter_misc":{"Written by":"Hongkiat Lim","Est. 
reading time":"8 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#article","isPartOf":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/"},"author":{"name":"Hongkiat Lim","@id":"https:\/\/www.hongkiat.com\/blog\/#\/schema\/person\/e3613a3bf757e4f67770f0b7a339edd0"},"headline":"ChatGPT Vision: What It Can and Cannot Do Currently","datePublished":"2023-11-07T13:01:08+00:00","mainEntityOfPage":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/"},"wordCount":950,"publisher":{"@id":"https:\/\/www.hongkiat.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#primaryimage"},"thumbnailUrl":"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg","keywords":["Artificial Intelligence"],"articleSection":["Internet"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/","url":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/","name":"ChatGPT Vision: What It Can and Cannot Do Currently - Hongkiat","isPartOf":{"@id":"https:\/\/www.hongkiat.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#primaryimage"},"image":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#primaryimage"},"thumbnailUrl":"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg","datePublished":"2023-11-07T13:01:08+00:00","description":"ChatGPT now has the capability to process images; simply take a picture of a complex serial number, and it will read and output the text for 
you.","breadcrumb":{"@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#primaryimage","url":"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg","contentUrl":"https:\/\/assets.hongkiat.com\/uploads\/chatgpt-vision\/hero.jpg"},{"@type":"BreadcrumbList","@id":"https:\/\/www.hongkiat.com\/blog\/chatgpt-vision\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.hongkiat.com\/blog\/"},{"@type":"ListItem","position":2,"name":"ChatGPT Vision: What It Can and Cannot Do Currently"}]},{"@type":"WebSite","@id":"https:\/\/www.hongkiat.com\/blog\/#website","url":"https:\/\/www.hongkiat.com\/blog\/","name":"Hongkiat","description":"Tech and Design Tips","publisher":{"@id":"https:\/\/www.hongkiat.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.hongkiat.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.hongkiat.com\/blog\/#organization","name":"Hongkiat.com","url":"https:\/\/www.hongkiat.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.hongkiat.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.hongkiat.com\/blog\/wp-content\/uploads\/hkdc-logo-rect-yoast.jpg","contentUrl":"https:\/\/www.hongkiat.com\/blog\/wp-content\/uploads\/hkdc-logo-rect-yoast.jpg","width":1200,"height":799,"caption":"Hongkiat.com"},"image":{"@id":"https:\/\/www.hongkiat.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/hongkiatcom","https:\/\/x.com\/hongkiat","https:\/\/www.pinterest.com\/hon
gkiat\/"]},{"@type":"Person","@id":"https:\/\/www.hongkiat.com\/blog\/#\/schema\/person\/e3613a3bf757e4f67770f0b7a339edd0","name":"Hongkiat Lim","description":"Founder and Editor in Chief of Hongkiat.com. Hongkiat is also a designer, developer, entrepreneur, and an active investor in the US stock market.","sameAs":["http:\/\/www.hongkiat.com\/blog"],"url":"https:\/\/www.hongkiat.com\/blog\/author\/hongkiat\/"}]}},"jetpack_featured_media_url":"https:\/\/","jetpack_shortlink":"https:\/\/wp.me\/p4uxU-ifN","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/posts\/70171","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/comments?post=70171"}],"version-history":[{"count":3,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/posts\/70171\/revisions"}],"predecessor-version":[{"id":70179,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/posts\/70171\/revisions\/70179"}],"wp:attachment":[{"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/media?parent=70171"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/categories?post=70171"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/tags?post=70171"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/www.hongkiat.com\/blog\/wp-json\/wp\/v2\/topic?post=70171"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}