varchar(max) in an Index

There is one more attendee question from SQLSaturday #118 that I want to follow up on, and it is this: “Is there any way to create an index with a varchar(max) column as the leading column?”

This question came up during my Statistics presentation, and was at the end as we were winding down.  My reply was, “I have no idea, I’ve never tried it.”

The attendee noted that he had tried to create the index but couldn’t – so it cannot be done – but he was looking to see if there was a creative way to work around it.  During the ensuing conversation he wondered if maybe it couldn’t be the leading column because the index would be too large, and then mentioned that he had ended up making in an included column in the index.  My question was, “Well if it’s an included column what does that do to the size of the index?”

Then I stopped and said something to the effect of, “Hang on, what is the problem you’re trying to solve?  Meaning, I’m guessing you’re trying to put a varchar(max) column in an index because you want to search on it.  Is making it an included column going to help that much?  I suppose you’ll get an index scan instead of a table scan, but is that really the right solution?”

My new favorite question to ask is, “What problem are you trying to solve?”  In this case, I assume that users need to search on the data in the varchar(max) field.  And this can be done in SQL Server a couple different ways.  My first thought was Full Text Search.  Now, configuring FTS is going to take more work than creating an index with an included column, but I would bet that performance and management long-term would be a lot easier.  You could also create a computed column on a substring of the varchar(max) column, and put that in an index.  There are many assumptions you have to handle here, most notably, is the data you want to search always in the same location within the original column?

I’m sure there are other options, but the overall point is this: understand what problem you’re trying to solve, and then figure out what options you have for solving it.  Once you have your options, understand the pros and cons of each, and then make an informed decision.

Circling back to the original question, one of the first things I tried when I was waiting at the Madison airport for my flight home was to create a table with a varchar(max) column and create an index with that as a leading column.  It of course failed, and Books Online clearly states:

Columns that are of the large object (LOB) data types ntext, text, varchar(max), nvarchar(max), varbinary(max), xml, or image cannot be specified as key columns for an index.

So there you have it, two presentations at SQLSaturday in Madison, and three blog posts from questions.  SQLSaturdays always reaffirm several things for me.  First, I am truly grateful for the SQL Community – the people, the opportunities, the fun…all of it.  Second, I love presenting, although I tend to think of it as teaching and even said during one of my sessions, “If you have questions please come ask me after class.”  Apparently I was having flashbacks to grad school.  And finally, I will never know enough; there is always so much more to learn.


13 Responses to varchar(max) in an Index
  1. Aaron Bertrand
    May 2, 2012 | 9:29 am

    Full-Text Search is easier than a lot of people make it sound. The problem is that many folks are minimalist and un-check that option during SQL Server setup if they don’t think they’re going to use it.

    On the flip side, there are always little barriers to usage, for example people starting out with SQL Server Express with Advanced Services, who want to try out Full-Text Search, can’t even create an index with the UI without stumbling upon this bug I reported this morning:

    Anyway you make a great point, and I see people do this all the time. “How do I ride my cow to work?” Some people focus on the obstacles presented there… outfitting the cow with the saddle, teaching him traffic rules, etc. Real thinkers step back and ask, “Why do you want to get to work on a cow?”

  2. Erin Stellato
    May 2, 2012 | 9:36 am

    I would think that the performance benefits of Full-Text would outweigh any setup work – but interesting point about people not installing it during setup. I have to think it’s not a ton of space to install the additional files.

    And here I’ve been getting frustrated that it takes 2 hours to get to work on my cow…I should be riding a horse!


    • Aaron Bertrand
      May 2, 2012 | 9:40 am

      A horse would be better, yes!

      And I don’t think it’s about space, it’s just a general “I don’t need that” attitude. I’m the same way about replication support. I don’t install it because I know I’m not going to use it, but I do make more informed decisions if I’m installing SQL Server for a customer.

      Now, it’s of course not difficult to add this after the fact, but if you had to convince a user to learn and understand how FTS is better *and* you have to walk them through installing it, it’s just one additional barrier.

  3. Kevin Boles
    May 2, 2012 | 10:02 am

    Nice post, and for more than just the technical review.

    Grad school? That’s new information for me. What did you pursue?

  4. Leonard
    May 2, 2012 | 1:29 pm

    Aaron – that might be my new favourite quote. Plus I REALLY want to see someone teach a cow traffic rules.

    Overall, nice post Erin.

  5. Chuck Rummel
    May 3, 2012 | 8:00 am

    In those situations I like to ask “What does God need with a starship?” and see if the people I’m asking know what my point is (and if they’re Star Trek fans.)

  6. […] varchar(max) in an Index – A fundamental lesson in problem solving that is so often overlooked due to over emphasis on the solution, shared courtesy of Erin Stellato (Blog|Twitter). […]

  7. John Sterrett
    May 5, 2012 | 11:51 pm

    Your blog post is an perfect example why I love SQL Server. There is always something new to learn. 🙂

  8. Jeremy
    July 10, 2013 | 10:39 am

    Would you shy away from adding varchar(max) columns as included columns in indexes in general? I am creating a covering index, but hesitant to ‘fully cover’ it because there are 3 varchar(max) columns being selected (along with the other columns). Would it make sense to create the covering index w/o the 3 varchar(max) columns and have key lookups get those, or is including them not going to incur as much overhead as I am thinking.



    • Erin Stellato
      August 26, 2015 | 12:41 pm


      A very belated reply here 🙂 But, I would add those columns only in specific circumstances. If I were searching on that column, I probably would not add it. If I were DISPLAYING it in a search result, consistently, then I might add it. You have to do the work to see if the benefit of adding it outweighs the cost. A query might run faster and be more efficient, but is it “faster enough” and efficient enough that it’s worth the cost of maintaining that column in the index, and the disk space (and space in memory) it consumes?


  9. Pardeep
    August 26, 2015 | 11:38 am

    Adding nonkey columns to the index uses more disk space to store the index. In particular, adding varchar(max), nvarchar(max), varbinary(max), or xml data types as nonkey columns may significantly increase disk space requirements, because the column values are copied into the index leaf level and also remain in the table or clustered index.\

    • Erin Stellato
      August 26, 2015 | 12:45 pm

      Correct, this is also something to consider.

Leave a Reply

Wanting to leave an <em>phasis on your comment?

Trackback URL