Researchers Advance Image Recognition Technology

Enterprise & IT, Nov 18, 2014

Google Research scientists have created artificial intelligence software capable of recognizing and describing the content of photographs and videos with greater accuracy than ever before. Google's machine-learning system can automatically produce captions that accurately describe images the first time it sees them. This kind of system could eventually help visually impaired people understand pictures, provide alternate text for images in parts of the world where mobile connections are slow, and make it easier for everyone to search on Google for images.

The idea comes from recent advances in machine translation between languages, where a Recurrent Neural Network (RNN) transforms, say, a French sentence into a vector representation, and a second RNN uses that vector representation to generate a target sentence in German.

The researchers replaced that first RNN and its input words with a deep Convolutional Neural Network (CNN) trained to classify objects in images. Normally, the CNN’s last layer feeds a final Softmax over the known classes of objects, assigning a probability that each object might be in the image. By removing that final layer, the researchers instead fed the CNN’s rich encoding of the image into an RNN designed to produce phrases. The whole system was then trained directly on images and their captions, maximizing the likelihood that the descriptions it produces match the training descriptions for each image. The resulting model combines a vision CNN with a language-generating RNN, so it can take in an image and generate a fitting natural-language caption.
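To make the training objective above concrete, here is a minimal sketch in plain Python of how a caption's log-likelihood could be computed from an image encoding. It is an illustration under stated assumptions, not Google's implementation: the class `ToyCaptioner`, its weight names, the plain tanh RNN cell, the tiny vocabulary, and the hand-written `image_encoding` vector (standing in for the CNN's output with the Softmax layer removed) are all hypothetical, and the weights are random rather than trained.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyCaptioner:
    """Hypothetical decoder: a plain tanh RNN seeded with an image encoding."""

    def __init__(self, vocab, image_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.vocab = vocab
        self.index = {word: i for i, word in enumerate(vocab)}
        self.W_img = random_matrix(hidden_dim, image_dim, rng)   # image -> state
        self.W_hh = random_matrix(hidden_dim, hidden_dim, rng)   # state -> state
        self.W_emb = random_matrix(hidden_dim, len(vocab), rng)  # word -> state
        self.W_out = random_matrix(len(vocab), hidden_dim, rng)  # state -> scores

    def init_state(self, image_encoding):
        # The image encoding replaces the source-sentence vector that the
        # first RNN would produce in the translation setting.
        return [math.tanh(v) for v in matvec(self.W_img, image_encoding)]

    def step(self, hidden, previous_word):
        """Advance one step and return (new_hidden, next_word_probabilities)."""
        column = self.index[previous_word]
        recurrent = matvec(self.W_hh, hidden)
        new_hidden = [
            math.tanh(r + row[column]) for r, row in zip(recurrent, self.W_emb)
        ]
        return new_hidden, softmax(matvec(self.W_out, new_hidden))

    def caption_log_likelihood(self, image_encoding, caption):
        """Sum of log p(word_t | image, earlier words), including the end token."""
        hidden = self.init_state(image_encoding)
        previous = "<s>"
        total = 0.0
        for word in caption + ["</s>"]:
            hidden, probs = self.step(hidden, previous)
            total += math.log(probs[self.index[word]])
            previous = word
        return total


vocab = ["<s>", "</s>", "a", "dog", "runs"]
model = ToyCaptioner(vocab, image_dim=4, hidden_dim=3)
image_encoding = [0.2, -0.5, 0.9, 0.1]  # stand-in for real CNN features
log_likelihood = model.caption_log_likelihood(image_encoding, ["a", "dog", "runs"])
print(log_likelihood)
```

Training would adjust the weights to raise this log-likelihood across all image and caption pairs in the training set. That step is omitted here because it needs gradient computation, which a deep-learning library would normally handle.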

Google says that its experiments with this system on several openly published datasets, including Flickr8k, Flickr30k and SBU, showed robust qualitative results. The system also performed well in quantitative evaluations with the Bilingual Evaluation Understudy (BLEU), a metric used in machine translation to evaluate the quality of generated sentences.
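For readers unfamiliar with BLEU, the sketch below shows the core of the metric: clipped n-gram precision against one or more reference sentences, combined by a geometric mean and scaled by a brevity penalty. It is a simplified, sentence-level version with no smoothing, and the function names are our own. It is not the evaluation code used in the paper, and published BLEU scores are normally computed over a whole corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Fraction of candidate n-grams found in a reference, with clipped counts."""
    candidate_counts = Counter(ngrams(candidate, n))
    if not candidate_counts:
        return 0.0
    # For each n-gram, the most times it appears in any single reference.
    max_reference = Counter()
    for reference in references:
        for gram, count in Counter(ngrams(reference, n)).items():
            max_reference[gram] = max(max_reference[gram], count)
    clipped = sum(
        min(count, max_reference[gram]) for gram, count in candidate_counts.items()
    )
    return clipped / sum(candidate_counts.values())


def bleu(candidate, references, max_n=4):
    """Simplified sentence-level BLEU with uniform n-gram weights."""
    precisions = [
        modified_precision(candidate, references, n) for n in range(1, max_n + 1)
    ]
    if min(precisions) == 0.0:
        return 0.0  # no smoothing: any zero precision gives a zero score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty uses the reference length closest to the candidate's.
    reference_length = min(
        (len(r) for r in references),
        key=lambda length: (abs(length - len(candidate)), length),
    )
    if len(candidate) > reference_length:
        penalty = 1.0
    else:
        penalty = math.exp(1.0 - reference_length / len(candidate))
    return penalty * math.exp(log_mean)


reference = "a dog runs across the grass".split()
print(bleu("a dog runs across the grass".split(), [reference]))
print(bleu("a dog runs on the grass".split(), [reference], max_n=2))
```

Clipping is what stops a degenerate caption from scoring well: a candidate that repeats "the" seven times matches a reference containing "the" twice on only two of its seven unigrams.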

To get more details about the framework used to generate descriptions from images, as well as the model evaluation, read the full paper here.

Tags: Google