SOLR Dismax search, using Haystack via Django API

In this tutorial, we'll see how to query an Apache Solr instance from a Django console in order to obtain a list of results and apply some filtering to it, using Dismax search. For this tutorial, I will assume that you are familiar with the following technologies and that you have them installed:

  • Django Framework
  • Django API
  • Apache Solr
  • Django Haystack module

I will also assume that your Django project can access Solr through Haystack. If it can't, set that up first by following the Haystack documentation: http://haystacksearch.org/

For reference, the platform I'll be using for this example is OpenShift, where I have installed everything needed. Now, let's start by studying the output of a basic Solr query (retrieve all results):

/collection1/select?q=*%3A*&wt=json&indent=true

    {
      "responseHeader":{
        "status":0,
        "QTime":0,
        "params":{
          "indent":"true",
          "q":"*:*",
          "wt":"json"}},
      "response":{"numFound":2,"start":0,"docs":[
          {
            "company_name":"SC Something Ltd.",
            "tags":["pistols",
              "rifles",
              "guns"],
            "text":"Something someone somewhere sometime",
            "external_link":"http://something-somewhere.com",
            "_version_":1527471593000796160},
          {
            "company_name":"Miniprix Someone LLC.",
            "tags":["python",
              "dhtml",
              "javascript"],
            "text":"Lorem ipsum dolor something someone somewhere",
            "external_link":"http://miniprixter.com",
            "_version_":1527471593010233344}]
      }}

From what we can see, we have a set of two documents, each containing the following items:

  • company_name
  • tags
  • text
  • external_link

Let's start by querying Solr from Django. First, we open a Django shell:

    python manage.py shell

Next, we import what we need: SearchQuerySet and AltParser (more about AltParser here: http://django-haystack.readthedocs.org/en/latest/inputtypes.html#haystack.inputs.AltParser)

    >>> import haystack
    >>> from haystack.query import SearchQuerySet
    >>> from haystack.inputs import AltParser

Now we are ready to query Solr. Let's list all results first:

    >>> docs = SearchQuerySet()
    >>> docs
    [<SearchResult: Anunturi.Firm (pk=u'3')>,
     <SearchResult: Anunturi.Firm (pk=u'4')>]

As we can see, the results are listed in order by pk, first 3, then 4.

Say we want to search for the word "someone" in all documents. We will indicate that we want to search in both text and company_name fields. We will also indicate that we need a minimum match of 1.

    >>> docs = SearchQuerySet()
    >>> # AltParser takes the parser name, the query string, then parser options as keyword arguments
    >>> dismax_query = AltParser('dismax', 'someone', qf='company_name text', mm=1)
    >>> filtered_results = docs.filter(content=dismax_query)
    >>> filtered_results
    [<SearchResult: Anunturi.Firm (pk=u'4')>,
     <SearchResult: Anunturi.Firm (pk=u'3')>]

Notice that the order of the results is different now: no. 4 comes first, then no. 3. The explanation is that the word "someone":

  • is only found in the company_name field of document no. 4
  • is found in the text field of both documents (no. 3 and no. 4)

Thus, document no. 4 has the searched word not only in the text field, but also in the company_name, an advantage that places it first in the search query set.

Using the AltParser we can also boost fields at query time. Say we wanted to alter the default list order by boosting a field (let's say with a boost factor of 30).

    >>> docs = SearchQuerySet()
    >>> # text^30 gives matches in the text field a boost factor of 30
    >>> dismax_query = AltParser('dismax', 'someone', qf='text^30 company_name', mm=1)
    >>> filtered_results = docs.filter(content=dismax_query)
    >>> filtered_results
    [<SearchResult: Anunturi.Firm (pk=u'4')>,
     <SearchResult: Anunturi.Firm (pk=u'3')>]

Here we indicated that the text field carries more weight in our search, so documents that contain the word "someone" in this field should be listed first. Both documents contain it in the text field, so the order is unchanged: no. 4 is first and no. 3 is second.

You can find more about how to boost stuff using Haystack here http://django-haystack.readthedocs.org/en/v2.4.1/boost.html?highlight=boost

Email-attachment via Mandrill, using JavaScript / jQuery

In this article, we'll see how to send an email with file attachments via Mandrill, using JavaScript & jQuery.

THE PROBLEM
The Mandrill API expects a base64 string in order to create a file attachment from it, so we'll need to obtain this string on the client-side. It also expects a file type and a file name, which we'll get from the file object (also on the client-side).

THE SOLUTION
We'll use HTML5's FileReader class to get the name, type and base64-content of the file, locally, on the client-side. We need to save this data somewhere, so for this article we'll use hidden form-inputs to store these 3 values. However, for production purposes you could use LocalStorage or other client-storage solutions.
We'll then send our form-data (e.g. sender's name, sender's email, message) and file-data (file-name, file-type, file-contents) via AJAX to a Mandrill endpoint, which will process the email further.

Let's start by creating an HTML page with the following contents:

<html>
<head>
  <title>JavaScript Email</title>
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js" type="text/javascript"></script>
</head>
<body>
  <form>
    <table>
      <tr>
        <td>
          <input type="text" id="from_email" placeholder="Email" />
        </td>
      </tr>
      <tr>
        <td>
          <input type="text" id="from_name" placeholder="Name" />
        </td>
      </tr>
      <tr>
        <td>
          <textarea placeholder="Message" cols="40" rows="5" id="message"></textarea>
          <input type="hidden" id="file_name" />
          <input type="hidden" id="file_type" />
          <input type="hidden" id="file_btoa" />
        </td>
      </tr>
      <tr>
        <td>
          <input type="file" id="file_content" />
        </td>
      </tr>
      <tr>
        <td>
          <button type="button" id="send_mail">Send mail</button>
        </td>
      </tr>
    </table>
  </form>
</body>
</html>


The rendered page should look like this:

form.PNG

Next, we'll choose a file from the filesystem to send as an attachment. From this file we need 3 things: name, type and contents. Let's use the code below to take a closer look at the file-object that we obtain when we select the file. Place the code right before the closing body tag.

<html>
...
<body>
...
<script type="text/javascript">
$("#file_content").change(function(){
  console.log( $(this) );
});
</script>
</body>
</html>


So now, when we choose an image from the file-system, we can analyze the file-object in our console. As you can see below, we are already able to obtain the name and the type of the file, in order to use them later:

analiza.PNG

Let's save these attributes as values of our hidden form-inputs (file_name and file_type), by replacing the console logging with some meaningful lines:

<html>
...
<body>
...
<script type="text/javascript">
$("#file_content").change(function(){
  //console.log( $(this) );
  // this.files is the input's FileList (.context was removed in jQuery 3)
  var file = this.files[0];
  $("#file_name").val( file.name );
  $("#file_type").val( file.type );
});
</script>
</body>
</html>


Click "Choose" again and select a file. If we now inspect the textarea element, we can see that the first two hidden inputs below it have values:

inspect.PNG

However, the third hidden input, file_btoa, is still empty and we need to populate it. This field will hold the base64 string obtained from the file's binary contents via the FileReader class. This is the value we'll send as the attachment content, so please complete the JavaScript code so it looks like this:

<html>
...
<body>
...
<script type="text/javascript">
$("#file_content").change(function(){
  //console.log( $(this) );
  // this.files is the input's FileList (.context was removed in jQuery 3)
  var file = this.files[0];
  $("#file_name").val( file.name );
  $("#file_type").val( file.type );
  // Instantiate the FileReader class
  var reader = new FileReader();
  // Register the handler before starting the read
  reader.onload = function(evt){
    var baseString = evt.target.result;
    // Obtain base64 from URL data (everything after the first comma)
    var string_btoa = baseString.substring(baseString.indexOf(",") + 1);
    // Set input value with the base64 string of the file
    $("#file_btoa").val(string_btoa);
  };
  // Obtain URL data from file contents
  reader.readAsDataURL(file);
});
</script>
</body>
</html>
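
If you'd like to reuse or test the base64 extraction on its own, it can be pulled out of the onload handler into a small function. The sketch below is one way to do it. The name dataUrlToBase64 is my own and not part of any library, and it assumes the input is a data URL of the form data:<type>;base64,<content>, which is what readAsDataURL produces.

```javascript
// Hypothetical helper (not part of the page script above):
// strips the "data:<type>;base64," prefix from a data URL
// and returns only the base64 content.
function dataUrlToBase64(dataUrl) {
  var commaIndex = dataUrl.indexOf(",");
  // Reject anything that does not look like a data URL
  if (dataUrl.indexOf("data:") !== 0 || commaIndex === -1) {
    throw new Error("Not a data URL");
  }
  // Everything after the first comma is the encoded content
  return dataUrl.slice(commaIndex + 1);
}
```

Inside reader.onload you could then write $("#file_btoa").val(dataUrlToBase64(evt.target.result)); instead of doing the string handling inline.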


Now, when selecting a file, you should be able to see something like this:

btoa.PNG

Note that, for this article, I'm using input values as a storage method. However, you should consider using a dedicated client-storage mechanism such as HTML5's LocalStorage (http://www.w3schools.com/html/html5_webstorage.asp) for keeping these kinds of key-value pairs.
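
As a rough idea of what the LocalStorage variant could look like, here is a minimal sketch. The function names and the "attachment" key are my own assumptions, not something the article's page uses. The helpers accept any object with setItem and getItem, so in a browser you would pass window.localStorage, while the small in-memory stand-in lets the same code run elsewhere.

```javascript
// Hypothetical key name; pick whatever suits your application.
var ATTACHMENT_KEY = "attachment";

// Store name, type and base64 content as one JSON string.
// `storage` is any object with setItem/getItem, e.g. window.localStorage.
function saveAttachment(storage, attachment) {
  storage.setItem(ATTACHMENT_KEY, JSON.stringify({
    name: attachment.name,
    type: attachment.type,
    content: attachment.content
  }));
}

// Read the attachment back, or return null if nothing was stored.
function loadAttachment(storage) {
  var raw = storage.getItem(ATTACHMENT_KEY);
  return raw === null ? null : JSON.parse(raw);
}

// In-memory stand-in so the sketch also runs outside a browser.
function createMemoryStorage() {
  var items = {};
  return {
    setItem: function (key, value) { items[key] = String(value); },
    getItem: function (key) {
      return Object.prototype.hasOwnProperty.call(items, key) ? items[key] : null;
    }
  };
}
```

Keep in mind that browsers limit how much data LocalStorage can hold, so large files may not fit once they are base64-encoded.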

Now, it's time to use the data we've obtained in order to send the email. You'll need to log into your Mandrill account. If you don't have one, you can create it here: https://mandrill.com/signup/
Once logged in, you should generate an API key. The screenshot below should give you a hint.

mandrill.PNG

Now that we have all the data that Mandrill needs, we're ready to send the email. Please complete the JavaScript so that it looks like this:

<html>
...
<body>
...
<script type="text/javascript">
$("#file_content").change(function(){
  //console.log( $(this) );
  // this.files is the input's FileList (.context was removed in jQuery 3)
  var file = this.files[0];
  $("#file_name").val( file.name );
  $("#file_type").val( file.type );
  // Instantiate the FileReader class
  var reader = new FileReader();
  // Register the handler before starting the read
  reader.onload = function(evt){
    var baseString = evt.target.result;
    // Obtain base64 from URL data (everything after the first comma)
    var string_btoa = baseString.substring(baseString.indexOf(",") + 1);
    // Set input value with the base64 string of the file
    $("#file_btoa").val(string_btoa);
  };
  // Obtain URL data from file contents
  reader.readAsDataURL(file);
});

// Prepare and send email
$("#send_mail").click(function(){

  // Set email content
  var contentHtml = $('#message').val() + '<br />' + $('#from_name').val() + '<br />' + $('#from_email').val();

  // Send email
  $.ajax({
    type: "POST",
    url: "https://mandrillapp.com/api/1.0/messages/send.json",
    data: {
      'key': 'xxxxxxxxxxxxxxxxxxxxxxxxxx', // Put your API key in here
      'message': {
        'from_email': $('#from_email').val(), // Our email input-value
        'from_name': $('#from_name').val(), // Our name input-value
        'to': [{
            'email': '2e23wc+2auusrkqefns4@sharklasers.com', // Inbox by www.guerrillamail.com
            'type': 'to'
          }],
        "attachments": [{
            "type": $("#file_type").val(), // Our file type input-value
            "name": $("#file_name").val(), // Our file name input-value
            "content": $("#file_btoa").val() // Our base64 string input-value
          }],
        'auto_text': 'true', // Mandrill's option for generating a text part is named auto_text
        'subject': 'New test email',
        'html': contentHtml
      }
    }
  }).done(function(response) {
    console.log(response); // if you're into that sorta thing
  });
});
</script>
</body>
</html>
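
One thing the click handler above does not cover is the case where no file was selected: it would still send an attachment entry with empty values. A possible way around this is to build the request data in a separate function and only add the attachment when all three values are present. This is just a sketch. buildSendRequest and its parameter names are hypothetical, while the field names inside the returned object are the ones used in the $.ajax call above.

```javascript
// Hypothetical helper: assembles the object passed as `data` to $.ajax.
// form: { fromEmail, fromName, message }
// file: { name, type, content } (content is the base64 string)
function buildSendRequest(apiKey, form, file, recipientEmail) {
  var message = {
    from_email: form.fromEmail,
    from_name: form.fromName,
    to: [{ email: recipientEmail, type: "to" }],
    subject: "New test email",
    html: form.message + "<br />" + form.fromName + "<br />" + form.fromEmail
  };
  // Only attach a file when name, type and content are all present
  if (file && file.name && file.type && file.content) {
    message.attachments = [{
      type: file.type,
      name: file.name,
      content: file.content
    }];
  }
  return { key: apiKey, message: message };
}
```

In the click handler, the values would come from the form inputs, for example buildSendRequest(apiKey, { fromEmail: $('#from_email').val(), fromName: $('#from_name').val(), message: $('#message').val() }, { name: $('#file_name').val(), type: $('#file_type').val(), content: $('#file_btoa').val() }, recipient), where apiKey and recipient are your own values.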


That's all!

Please note that this article is mostly about how to obtain the data you need from the file on the client-side and pass it to a Mandrill API endpoint via Ajax. Whether or not the emails actually get sent is up to Mandrill.

You can always check the status of the emails you send via Mandrill in the "Outbound" section of your account. However, even if the status is "Delivered", that doesn't mean your mail was sent successfully. Your best bet is to click on "Delivered" and see if there are any SMTP events.
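
You can also get a first hint from the response that the .done() callback logs. The sketch below assumes the response is an array with one entry per recipient, each carrying an email, a status and, for rejections, a reject_reason, which is how I understand Mandrill's send response. Treat those field names and the status values "rejected" and "invalid" as assumptions to check against the current API documentation. findFailedRecipients is a hypothetical name.

```javascript
// Hypothetical helper: lists recipients whose message was not accepted.
// Assumes `response` is an array of { email, status, reject_reason } objects.
function findFailedRecipients(response) {
  return response.filter(function (result) {
    return result.status === "rejected" || result.status === "invalid";
  }).map(function (result) {
    // Fall back to the status when no explicit reason is given
    return { email: result.email, reason: result.reject_reason || result.status };
  });
}
```

Calling findFailedRecipients(response) inside .done() and logging the result would show right away which addresses were refused, before you go looking in the "Outbound" section.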

For ease of testing, I recommend using a disposable email service as your inbox. For example, Guerrilla Mail ( http://www.guerrillamail.com ) offers a good temporary email service.

Also, try testing from a server, so you can offer a trustworthy IP address.

GitHub – MariusIlina
