Varchar Vs Char

Every once in a while I’ll find a field someone created where the datatype choice was not the best. This comes up relatively often with Varchar vs Char. All of the confusion just comes from not understanding the differences and causes and effects of each. With that, I’ll outline the only times I will use one over the other.

Let’s take a look at the behavior of char vs varchar:

CREATE TABLE #meme
(
  first_name CHAR(50),
  last_name VARCHAR(50)
)

INSERT INTO #meme
(first_name, last_name)
SELECT 'john', 'smith'

SELECT * FROM #meme

SELECT DATALENGTH(first_name), DATALENGTH(last_name)
FROM #meme

You can see from the value returned from DATALENGTH that first_name is 50 bytes long despite only taking 4 characters. The other characters are empty strings.

Storage wise, varchar is out the gate up to 2 bytes larger than a char because it needs to store the length. It’s 1 byte larger if it’s over 255 characters. That’s why you will often see varchar definitions as varchar(255). Pain-in-the-butt wise, char is far more a pain to deal with unless all the data in the char column takes up the exact amount of space defined for the char column. I typically only use char if all the fields are going to be the exact same size guaranteed. Even then, I’m hesitant. Partly because disk space is not as much of an issue as it was before, and mostly because if you DO have variable length fields in a char column, then the remaining spaces you don’t fill up, will be padded with empty spaces. This forces you to use the LTRIM(RTRIM()) debacle for every row you return.

varchar and char only store single byte character sets. For double byte use nvarchar or nchar

There are potential situations where you would clearly want to use char over varchar. Suppose you had an initial value with a small data length that was going to be updated with a large data length. In other words, if you defined a varchar(255) that initially got populated with the word “Hi”. And over time, this field continually got updated and the size of it continued to grow to fill the whole 255 max defined size. In this situation you would have bad fragmentation because as the size of the field grow, if there was not enough room on the data page another page would have to be created to house the new data.

Also, if just for storage purposes and no retrieval, you may want to consider char if space will be a factor for you. As you can see, there is no cut and dry answer for your decision to use char over varchar, but overall if you have a good maintenance plan that deals with fragmentation and you plan to retrieve the data often, and the data is variable length that doesn’t vary widely, varchar is probably the better choice.

5 comments
Comandant 29 Aug 2015 at 5:15 am

Hi Brent,I was wondering if you had any ignhsit to the current state of database performance with the new Server/Sql 2008 options at Amazon EC2. You had an article a while back which criticized the performance of Sql 2005 on the 2003 servers they had, saying that the throughput was the equivalent of running your db off a thumb-drive. It appears like they’re trying to address some of the previous limitations of Windows on EC2 (server version, boot drive, etc.) and I was wondering if this translated to performance benefits.Cheers,Rob

Edward 29 Jan 2013 at 2:12 pm

Thanks Derek this solved one of the PITA issues when I was trying to compare two string values that were stored as char.. I ll keep that in mind in the future.

Good Stuff

Derek Dieter 31 Jan 2013 at 6:01 pm

No problem EP.. yeah char’s can be a pain when comparing them..

Justin 29 Aug 2015 at 8:35 am

you’ve studied graph throey in computer science classes you’ll remember that the order of child nodes in a tree is irrelevant to the graph. That is, (R (F) (M) (L)) is the same tree as (R (M) (L) (F)).So keep in mind that display order data properly should be separate from the hierarchy data.But since path enumeration is essentially a denormalized solution already, perhaps we can bend the rules.Therefore you could just make sure to update the primary key values of tree nodes, so that the natural order of these values matches your desired display order.This is admittedly a fragile solution, and pretty much a hack of coupling the sibling order to the primary key values (which should be arbitrary and independent of meaning).This illustrates that each solution to represent trees in SQL has its strengths and weaknesses, and therefore which design you use should be based on the types of queries you need in a given application.It would take a whole book to compare how to implement every scenario in all the designs of trees. In my book “SQL Antipatterns,” I had to limit myself to 20 pages on hierarchical data, because there are other topics to cover.

Wfaa 29 Aug 2015 at 5:16 am

Not to be overly anclatiyal about the new server options, but they do in fact feel zippier. It’s definitely cool to see how fast they boot and even cooler that they can sit idle and uncharged in the stopped state. Neither of these perks address day-to-day operations however.Maybe in the New Year I could provide the server(s) and you could provide the skill(s) or at least direction to dig into it a little more. Thanks for < 1 min. reply,Rob

Featured Articles

 Site Author

  • Thanks for visiting!