Generating images with transfer learning

Generating images with a pre-trained network is one of the most visual applications of transfer learning.

I’ve been playing with a technique that has taken deep learning by storm: neural style transfer. Introduced by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge in 2015, neural style transfer is an optimization technique that takes three images: a content image, a style reference image (such as an artwork by a famous painter), and the input image you want to style. It blends them so that the input image is transformed to look like the content image, but “painted” in the style of the style image.
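To make the optimization a bit more concrete: in the Gatys et al. formulation, style is compared through Gram matrices of feature maps (channel-by-channel inner products), while content is compared through the feature maps directly. Below is a minimal sketch of those two losses in plain Python. The feature maps are hypothetical toy lists rather than activations from a real pre-trained network, the normalization is simplified, and in practice content and style are usually measured on different layers.

```python
def gram_matrix(features):
    """features: list of channels, each a flat list of activations.
    Returns the matrix of inner products between every pair of channels."""
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in features]
            for ci in features]


def style_loss(generated, style):
    """Mean squared difference between the two Gram matrices."""
    g, s = gram_matrix(generated), gram_matrix(style)
    n = len(g)
    return sum((g[i][j] - s[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)


def content_loss(generated, content):
    """Mean squared difference between the raw feature maps."""
    diffs = [(a - b) ** 2
             for cg, cc in zip(generated, content)
             for a, b in zip(cg, cc)]
    return sum(diffs) / len(diffs)


def total_loss(generated, content, style, alpha=1.0, beta=1.0):
    """Weighted sum that the optimizer would minimize (weights are assumptions)."""
    return alpha * content_loss(generated, content) + beta * style_loss(generated, style)


# Toy feature maps: two channels, two spatial positions each.
style = [[1.0, 2.0], [3.0, 4.0]]
shuffled = [[2.0, 1.0], [4.0, 3.0]]   # same activations, positions swapped

print(gram_matrix(style))             # [[5.0, 11.0], [11.0, 25.0]]
print(style_loss(shuffled, style))    # 0.0
```

The second print shows why the Gram matrix captures style rather than layout: swapping spatial positions leaves it unchanged, so the style loss stays at zero even though the content loss would not.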

I have uploaded the generated images here: https://ello.co/datascientist1

Product categorization and product tagging machine learning solutions

When you enter a physical store, signs above the aisles denote the category of products sold in each section.

The online equivalent is the product path, which shows the main category, subcategory, level 3 category, and so on. In the early days of the internet these categories were set manually, but with Amazon now selling millions of products, product categorization has since been automated using machine learning and AI models. There are millions of niches available for products to sell.

All one needs to train an ML model is a good training data set in the form product name -> categories; one can then train an appropriate text classification model.
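As an illustration of the product name -> category setup, here is a minimal sketch of a multinomial naive Bayes text classifier written with the standard library only. It is one simple choice of text classification model, not the method used by any particular platform, and the four training examples and the category paths are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    return text.lower().split()


def train(examples):
    """examples: list of (product_name, category) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for name, category in examples:
        class_counts[category] += 1
        for word in tokenize(name):
            word_counts[category][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(model, name):
    """Return the category with the highest log-probability for the name."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for category, n in class_counts.items():
        score = math.log(n / total)                       # class prior
        denom = sum(word_counts[category].values()) + len(vocab)
        for word in tokenize(name):
            if word in vocab:                             # ignore unseen words
                # Laplace (add-one) smoothing avoids zero probabilities.
                score += math.log((word_counts[category][word] + 1) / denom)
        if score > best_score:
            best, best_score = category, score
    return best


# Hypothetical training set; real systems need far more examples.
examples = [
    ("diamond stud earrings", "Jewelry > Earrings"),
    ("gold hoop earrings", "Jewelry > Earrings"),
    ("wireless bluetooth headphones", "Electronics > Headphones"),
    ("noise cancelling headphones", "Electronics > Headphones"),
]
model = train(examples)
print(predict(model, "silver hoop earrings"))   # Jewelry > Earrings
```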

Regarding possible categories, there are two main options: the Google Product Taxonomy and the IAB classification. Both have Tier 1, Tier 2 and lower tiers of categories.

For ecommerce product categorization, the Google Product Taxonomy is the more appropriate choice, whereas IAB is more general and has gone through several revisions.

An excellent SaaS platform that offers product categorization is Productcategorization.com. You can try out their demo at this address:

https://www.productcategorization.com/demo_dashboard/

It presents the output as a chart as well as a JSON file that can be exported.

Another important form of categorization is product tagging. This is a more modern way of classifying products and has the added benefit that there is no general upper limit on the number of tags that can be assigned to the products of an online shop.

A solution offering product tagging is available at producttagging.com. You can try out a demo of product tagging at:

https://www.producttagging.io/demo_dashboard/

I tried "diamond earring" and got these tags (the percentages denote how relevant each tag is for the product name):

diamond: 58 %
earrings: 33 %
earring: 22 %
diamond earrings: 7 %
jewelry: 6 %
white gold: 6 %
diamonds: 6 %
yellow gold: 4 %
hoop: 3 %
bridal: 2 %

If you are interested in learning more about the theoretical background of product categorization, check out this article on the topic:

https://medium.com/product-categorization/product-categorization-introduction-d62bb92e8515

There is also an interesting collection of slides on the topic of product categorization: https://slides.com/categorization

Product tagging and categorization have a bright future, with the number of online shops rapidly increasing.

Another, more general text classification problem is website categorization.

Crypto social media analysis

Social media has played an important role in driving the narrative around the cryptocurrency sector in recent years. The original paper by Satoshi Nakamoto was first circulated through mailing list and forum posts (see e.g. the Satoshi Nakamoto posts), but in later years the hype about cryptocurrencies was substantially driven by social media, especially Twitter.

It is interesting that social media has recently become important for the stock market as well, where subreddits like https://www.reddit.com/r/stocks/ have been important drivers of stocks, as happened with GameStop earlier this year. Social media is increasingly democratizing access to information, addressing one of the earlier pain points of finance, namely how to inform people about stocks. Still, new investors are strongly advised to pay close attention to fundamental data about stocks.

But back to crypto social media analysis. How does one approach this?

The first step is to build a bot that regularly analyses Twitter, Reddit, YouTube and other social media websites. When analysing a given text, one parses it to find mentions of cryptocurrency tickers and names, e.g. BTC and Bitcoin. The Python library flashtext is well suited to this: https://github.com/vi3k6i5/flashtext
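The mention-finding step can be sketched as follows. To keep the example dependency-free it uses the standard library re module as a stand-in for flashtext, and the alias table is a hypothetical two-coin example; a real bot would load a much larger list of tickers and names.

```python
import re

# Hypothetical alias table: every surface form maps to a canonical coin name.
ALIASES = {
    "btc": "Bitcoin",
    "bitcoin": "Bitcoin",
    "eth": "Ethereum",
    "ethereum": "Ethereum",
}

# Word boundaries stop "eth" from matching inside words such as "method".
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, ALIASES)) + r")\b", re.IGNORECASE
)


def find_mentions(text):
    """Return canonical coin names found in text, in order of first appearance."""
    seen = []
    for match in PATTERN.finditer(text):
        coin = ALIASES[match.group(1).lower()]
        if coin not in seen:
            seen.append(coin)
    return seen


print(find_mentions("BTC rallies while Ethereum (ETH) lags"))  # ['Bitcoin', 'Ethereum']
```

A regular expression with many alternatives slows down as the alias list grows, which is the main reason to switch to a dedicated keyword library once the list gets long.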

Here is an example of news titles (from a sentiment API) that have been tagged with the respective cryptocurrencies:

Then, the text is classified in terms of sentiment. One way to build such a classifier is to use Support Vector Machines.
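For illustration, here is a minimal sketch of a linear SVM sentiment classifier trained with Pegasos-style sub-gradient descent on the hinge loss, using bag-of-words features and only the standard library. The six training headlines, the regularization strength and the number of epochs are all assumptions made for the example; in practice one would more likely reach for an existing SVM implementation and a far larger labeled data set.

```python
def featurize(text):
    """Bag-of-words feature vector as a sparse dict of word counts."""
    vec = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec


def score(weights, text):
    """Signed distance-like score: positive means positive sentiment."""
    return sum(weights.get(k, 0.0) * v for k, v in featurize(text).items())


def train_linear_svm(examples, lam=0.1, epochs=20):
    """examples: list of (text, label) with label +1 or -1."""
    weights = {}
    t = 0
    for _ in range(epochs):
        for text, y in examples:
            t += 1
            eta = 1.0 / (lam * t)              # decaying step size
            x = featurize(text)
            margin = y * sum(weights.get(k, 0.0) * v for k, v in x.items())
            # The L2 regularizer shrinks every weight on every step.
            for k in weights:
                weights[k] *= 1.0 - eta * lam
            # The hinge loss contributes only when the margin is violated.
            if margin < 1.0:
                for k, v in x.items():
                    weights[k] = weights.get(k, 0.0) + eta * y * v
    return weights


# Hypothetical labeled headlines: +1 positive, -1 negative.
examples = [
    ("bitcoin surges to new high", 1),
    ("bitcoin crashes after hack", -1),
    ("strong rally lifts ethereum", 1),
    ("fear spreads as prices plunge", -1),
    ("investors cheer big gains", 1),
    ("heavy losses hit investors", -1),
]
weights = train_linear_svm(examples)

for headline in ("ethereum rally gains", "prices plunge losses"):
    label = "positive" if score(weights, headline) > 0 else "negative"
    print(headline, "->", label)
```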

Together, both types of data give us an effective way of doing crypto social media analysis: we can display both the number of social media mentions of each cryptocurrency and their sentiment.

Interestingly, social media mentions often closely follow price. Here is an example for Bitcoin:

In the last few days the relation was almost 1:1. It is thus useful to take crypto social media analysis into account as an additional source of information when analysing the crypto market.
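Combining the two signals is a simple aggregation. The sketch below assumes each analysed post has already been reduced to a list of coin names plus a sentiment label of +1 or -1; that input format and the sample posts are hypothetical.

```python
from collections import defaultdict


def summarize(posts):
    """posts: iterable of (coins, sentiment) pairs.
    Returns {coin: (mention_count, mean_sentiment)}."""
    counts = defaultdict(int)
    totals = defaultdict(float)
    for coins, sentiment in posts:
        for coin in set(coins):        # count each coin once per post
            counts[coin] += 1
            totals[coin] += sentiment
    return {coin: (counts[coin], totals[coin] / counts[coin]) for coin in counts}


# Hypothetical analysed posts.
posts = [
    (["Bitcoin"], 1),
    (["Bitcoin", "Ethereum"], -1),
    (["Bitcoin"], 1),
]
summary = summarize(posts)
print(summary["Ethereum"])   # (1, -1.0)
```

Running this per hour or per day gives the two time series, mention counts and average sentiment, that can be plotted against price.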

Data Visualization Consulting

There is an old saying that a picture is worth a thousand words, and in modern content marketing this is often true. The most viral posts are often the ones where someone produces an interesting presentation of a unique data set and its analysis.

Another format that attracts a lot of interest is the infographic.

Data visualization consulting has thus emerged in recent years as an important way to generate interest in content and, through the links it attracts, to improve search engine rankings.

Our AI company for Data Visualization Services Consulting specializes in producing distinctive data visualization charts and images that help clients tell unique stories and find new angles on their content.

We also provide a platform for Keyword, Niches and Trends Research – UnicornSEO, which allows you to explore complete niches in an in-depth way.

Here are a couple of images from our UnicornSEO platform that show the potential for data visualizations in content marketing:

Geo location of photos using deep learning

Computer vision tasks in AI consulting often involve classification problems, where one trains a deep neural net to assign a given image to one of a set of discrete classes.

Typical examples are classifying images of animals, food, etc.

A classic problem of this kind is classifying images as either cat or dog, see e.g. https://www.kaggle.com/c/dogs-vs-cats

Transfer learning

In cases like this one often takes advantage of transfer learning. This means that one significantly shortens the development time needed to train the NN for a particular CV problem by starting from a pre-trained neural net that was trained on some other computer vision problem.

It is common to use pre-trained models from well-known and well-researched problems. Examples of pre-trained computer vision models are VGG and Inception.

Geo location from photos

Recently, as part of computer vision consulting, I came across a rather unusual computer vision problem: a classification from images where the result is a set of location coordinates, latitude and longitude.

In other words, given an image, the deep learning net tries to determine the physical location where the image was taken, giving a pair of numbers for latitude and longitude.

Various researchers have taken up this challenge. Several years ago, researchers at Google were among the first, with their PlaNet solution:

https://arxiv.org/abs/1602.05314

At first sight, the problem looks very difficult. One can easily find a picture for which it is hard to detect the location. However, many images contain a lot of information thanks to the presence of landmarks, typical vegetation, weather, architectural features and similar cues.

The approach taken by the PlaNet solution, and by another solution that we will describe shortly, is to partition the surface of the earth into thousands of cells and then use a large set of geotagged images to train a classifier over those cells. An example of a huge source of geotagged images is Flickr.
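The idea of turning coordinates into classes can be sketched with a fixed latitude/longitude grid. This is a deliberate simplification: PlaNet uses an adaptive partitioning with finer cells where there are more photos, whereas the sketch below uses equal 10-degree cells, and the function names and cell size are assumptions made for illustration.

```python
def cell_id(lat, lon, cell_deg=10.0):
    """Map a coordinate to the index of the grid cell that contains it."""
    cols = int(360 / cell_deg)
    rows = int(180 / cell_deg)
    # Clamp so that lat = 90 and lon = 180 fall into the last row/column.
    row = min(int((lat + 90.0) // cell_deg), rows - 1)
    col = min(int((lon + 180.0) // cell_deg), cols - 1)
    return row * cols + col


def cell_center(cell, cell_deg=10.0):
    """Return the (lat, lon) center of a cell, used as the predicted location."""
    cols = int(360 / cell_deg)
    row, col = divmod(cell, cols)
    lat = row * cell_deg - 90.0 + cell_deg / 2
    lon = col * cell_deg - 180.0 + cell_deg / 2
    return (lat, lon)


# Training label for a geotagged photo taken on Ibiza (roughly 38.98 N, 1.43 E).
label = cell_id(38.98, 1.43)
print(label)                # 450
print(cell_center(label))   # (35.0, 5.0)
```

Each geotagged training image gets its cell index as the class label, and at prediction time the center of the most probable cell serves as the estimated latitude and longitude. Smaller cells give finer estimates but leave fewer training images per class.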

Another interesting approach is the one taken by the team from the Leibniz Information Centre for Science and Technology (TIB), Hannover, and the L3S Research Center, Leibniz Universitaet Hannover, in Germany.

Their approach is similar to PlaNet: they divide the whole earth into cells, but they also have a special decision layer that takes the scene content into account, i.e. whether it is an indoor, natural or urban setting.

I tried out their library https://github.com/TIBHannover/GeoEstimation and can confirm that it gives surprisingly good results.

The team has also put out an online version of their model and you can check it out here:

https://tibhannover.github.io/GeoEstimation/

If I send this image to the photo geo location tool:

The deep learning tool correctly places the image in the Mediterranean region (its correct location is Ibiza, Spain).