Damian Mehers' Blog Android, VR and Wearables from Geneva, Switzerland.

28 Oct 2012

Automatically generating Evernote note titles from text in images

Once a month I scan my physical mail and receipts into Evernote, and then rename each of the new notes to something meaningful.

Before I rename them, the new notes have titles such as “CCE272012_00003.pdf” and “CCI282012_00029.jpg”:

image

I wondered whether it might be possible to write a program to automatically generate a title from the text that Evernote finds within the scanned image …

It turns out that it is possible.  These are exactly the same notes as shown above, with their new titles automatically generated from the text Evernote found in the scanned images.

image

This is just a proof of concept, although if you are interested in using it, please let me know in a comment on this post so that I can look at making it available as a web-based utility.

How does it work?

If you are not a developer then this won’t mean much to you – feel free to bail now!

Essentially, the program looks for all notes that match specific search criteria (for example, in a specific notebook and with a specific title) and that have a single resource with recognition data. For each such note, it concatenates the best-match words from the first couple of lines of text in the image, using only words whose recognition weight is above a certain level.

Here is the program (in C#):

using System;
using System.Linq;
using System.Text;
using System.Xml.Linq;
using Evernote.EDAM.NoteStore;
using Evernote.EDAM.Type;
using Evernote.EDAM.UserStore;
using Thrift.Protocol;
using Thrift.Transport;

namespace AutoTitleEvernote {
  internal class Program {
    // Get this from https://www.evernote.com/api/DeveloperToken.action
    private const string AuthToken = "...";

    // Change this as appropriate
    private const string SearchString = @"Notebook:Inbox intitle:CC*";
    private const int MaxNotes = 1000; // Max nr of notes we will process
    private const int LineFudge = 10; // Used to determine if words are on the same line 
    private const int Lines = 2; // How many lines of text to read
    private const int MinWeight = 50; // How good a match do we need on words
    private const string HostUrl = "https://www.evernote.com/edam/user";

    private static void Main() {

      var noteStoreClient = GetNoteStoreClient(AuthToken);

      // Find notes with the required title in the required notebook
      var filter = new NoteFilter { Words = SearchString, 
                                    Ascending = false, 
                                    Order = (int)NoteSortOrder.CREATED };
      var notes = noteStoreClient.findNotes(AuthToken, filter, 0, MaxNotes);

      // For each note with a single resource and recognition data
      foreach (var note in notes.Notes.Where(n => n.Resources != null &&
                                                  n.Resources.Count == 1 &&
                                                  n.Resources.First().Recognition != null)) {
        // Download and parse the recognition XML
        var recoXmlBytes = noteStoreClient.getResourceRecognition(AuthToken,
                                                                  note.Resources.First().Guid);
        var recoXml = XElement.Parse(Encoding.UTF8.GetString(recoXmlBytes));

        var items = recoXml.Elements("item").ToList();
        var title = new StringBuilder();

        var lineY = -1;
        var line = 0;

        // For each word
        foreach (var item in items) {
          // Keep track of the current line
          var y = int.Parse(item.Attributes("y").First().Value);
          if (lineY == -1) {
            lineY = y;
          }
          else {
            if (y > lineY + LineFudge) {
              if (++line >= Lines) {
                break; // We've moved beyond the number of candidate lines
              }
              lineY = y;
            }
          }

          // Find the word's text and weight if the weight is above the criteria
          var word = (from t in item.Elements("t")
                      let weight = int.Parse(t.Attribute("w").Value)
                      orderby weight descending
                      where weight > MinWeight
                      select new { Weight = weight, Text = t.Value }).FirstOrDefault();

          if (word == null ||
              title.Length + word.Text.Length + 1 >=
                     Evernote.EDAM.Limits.Constants.EDAM_NOTE_TITLE_LEN_MAX) {
            break;
          }
          if (title.Length > 0) {
            title.Append(" ");
          }
          title.Append(word.Text);
          // title.Append("[" + word.Weight + "]");
        }
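        if (title.Length == 0) {
          continue; // No usable words were found, so leave the note's title alone
        }

        Console.Out.Write("Rename " + note.Title + " to " + title + "? ");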
        Console.Out.Flush();
        var input = Console.In.ReadLine();

        if (input == null || input.ToLower() != "y") {
          continue;
        }

        note.Title = title.ToString();
        noteStoreClient.updateNote(AuthToken, note);
      }

    }

    private static NoteStore.Client GetNoteStoreClient(string authToken) {
      var userStoreUrl = new Uri(HostUrl);
      var userStoreTransport = new THttpClient(userStoreUrl);
      var userStoreProtocol = new TBinaryProtocol(userStoreTransport);
      var userStore = new UserStore.Client(userStoreProtocol);

      var noteStoreUrl = userStore.getNoteStoreUrl(authToken);

      var noteStoreTransport = new THttpClient(new Uri(noteStoreUrl));
      var noteStoreProtocol = new TBinaryProtocol(noteStoreTransport);
      var noteStore = new NoteStore.Client(noteStoreProtocol);
      return noteStore;
    }
  }
}

A word about Evernote’s text recognition

One of the many cool things about Evernote is that it automatically finds text in scanned images and PDFs so that when you subsequently search for that text, you can find the corresponding notes.

Evernote’s text recognition is incredibly powerful. It finds text that is at an angle, hand-written, and in various languages.

It is powerful, but it is not designed to provide a text version of a scanned document. Instead, it is designed to make searching for words work very well. For example, for each part of an image where it detects a word, it records a series of candidate words that might match, each with a “weight” indicating how good a match it thinks it is.
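
To make the candidate words and weights more concrete, here is a minimal sketch that picks the best candidate for each detected word. It assumes the recognition data has the same shape the program above relies on: `item` elements positioned by `x` and `y` attributes, each holding `t` candidates whose `w` attribute is the weight. The sample XML is hand-written for illustration and did not come from a real note, and `RecoIndexDemo` and `BestCandidate` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

internal static class RecoIndexDemo {
  // Hand-written sample in the shape of Evernote's recognition XML.
  // All positions, weights and words are made up for illustration.
  internal const string SampleXml = @"<recoIndex objWidth=""800"" objHeight=""600"">
  <item x=""40"" y=""30"" w=""120"" h=""28"">
    <t w=""87"">Invoice</t>
    <t w=""42"">Invoke</t>
  </item>
  <item x=""170"" y=""33"" w=""60"" h=""26"">
    <t w=""35"">2O12</t>
    <t w=""61"">2012</t>
  </item>
</recoIndex>";

  // Returns the highest-weighted candidate for one <item>,
  // or null if no candidate has a weight above minWeight.
  internal static string BestCandidate(XElement item, int minWeight) {
    return item.Elements("t")
               .Select(t => new { Weight = (int)t.Attribute("w"), Text = t.Value })
               .Where(c => c.Weight > minWeight)
               .OrderByDescending(c => c.Weight)
               .Select(c => c.Text)
               .FirstOrDefault();
  }

  private static void Main() {
    var recoXml = XElement.Parse(SampleXml);
    foreach (var item in recoXml.Elements("item")) {
      var best = BestCandidate(item, 50) ?? "(no confident match)";
      Console.WriteLine("y={0}: {1}", (int)item.Attribute("y"), best);
    }
  }
}
```

With a minimum weight of 50, this sample yields “Invoice” for the first word and “2012” for the second. Raising the threshold above 87 would leave both words without a confident match, which is why a threshold that is too strict can produce an empty title.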

What I’m doing here is not really what Evernote’s text recognition was designed for … but it does work …

Filed under: Evernote
Comments (2)
  1. Hi Damian,

    I am an enthusiastic Evernote user from Zurich and have stumbled across your blog here.
    Your approach to generate titles for the notes is great.

    I tried to compile the code but am not an experienced C# programmer, I am more acquainted with Java.
    Will you ever provide this as a web-based utility?

    Have a nice Xmas break.
    Regards,
    Mike

  2. Hi Damian,

    This would be a great feature.

    Currently, the Windows desktop client automatically picks up the first ~80 characters (Evernote.EDAM.Limits.Constants.EDAM_NOTE_TITLE_LEN_MAX) to be set as the note title, if it has not been given one.

    As an enhancement, Evernote team could enable setting the title automatically for untitled notes (text notes and notes that contain images that have text recognized in them – based on your criteria above) on all platforms.

    Happy New year!
