Crack password of documents - Word, Excel, PDF - security concerns

There are many tools out there that claim to recover or crack document passwords, and they get plenty of customer attention. But the question is: how many of you have actually succeeded with such tools? Most of them share the common methods explained below, and we should think hard about their success rate before spending money on them.

I believe that if a document is strongly protected (128-bit encryption), the chance of cracking the password and opening the document is, as of now, almost nil! All we can do with these tools is try our luck (yes, I really mean it). If we are really lucky, voila, we get the password; otherwise we can leave the password cracking tool running for millions of years!!

This article is for learning purposes only and aims to show the vulnerability of legacy 40-bit RC4 encryption.

Common types of attack (methods of cracking)

Password cracking tools usually use one or more of the basic methods below, often mixing and matching them.

1. Dictionary attack

As the name suggests, in this method the tool tries to open the document with every word or combination of words from a long list, called a dictionary. The candidates can also be dictionary + numbers, numbers + dictionary + special characters, and so on. Usually there are options that let the user specify which dictionary to use, what password lengths to try, etc.

A dictionary attack usually takes less time to complete than a brute force attack, which may run forever!! I stress ‘to complete’ because completing the attack does not mean the password has been recovered. It may finish without any result and simply say ‘we could not find the password’. Damn, eh?
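To make the idea concrete, here is a minimal sketch of a dictionary attack in Python. It is not how any particular tool works. The names dictionary_candidates, dictionary_attack and is_correct are made up for this illustration, and is_correct stands in for "try to open the document with this password".

```python
from itertools import product

def dictionary_candidates(words, prefixes=("",), suffixes=("",)):
    """Yield every word (as written and capitalized) with each prefix/suffix."""
    for prefix, word, suffix in product(prefixes, words, suffixes):
        # dict.fromkeys drops the duplicate when the word is already capitalized.
        for variant in dict.fromkeys((word, word.capitalize())):
            yield f"{prefix}{variant}{suffix}"

def dictionary_attack(candidates, is_correct):
    """Return the first candidate accepted by is_correct, or None."""
    for candidate in candidates:
        if is_correct(candidate):
            return candidate
    return None  # the attack completed, but found nothing

# Hypothetical check: a real tool would try to open the document instead.
words = ["secret", "dragon"]
suffixes = ("", "1", "123")
hit = dictionary_attack(dictionary_candidates(words, suffixes=suffixes),
                        lambda p: p == "Dragon123")
miss = dictionary_attack(dictionary_candidates(words, suffixes=suffixes),
                         lambda p: p == "Tr0ub4dor&3")
print(hit, miss)
```

The second call shows the 'completed without result' case: the list is exhausted and the function returns None.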

2. Brute force attack

While a dictionary attack tries combinations of dictionary words, a brute force attack tries every possible combination of letters, numbers, special characters, punctuation, etc. For example, it may start with a, then aa, then aaa and so on, then b, bb, bbb and so on, and then other combinations such as ab, ac, ad. Each tool has its own way of ordering these combinations, and there are usually options such as minimum and maximum password length and which characters to include (lowercase, uppercase, letters only, etc.). If you can narrow down at least some of these options, there is a chance of recovering the password. If you have no idea about the password, the chance is very, very small.
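A rough sketch of how the candidates can be generated, and how quickly their number grows with the options chosen. The ordering here (all passwords of length 1, then length 2, and so on) is just one possible choice, and both function names are invented for this example.

```python
from itertools import product

def brute_force_candidates(alphabet, min_len, max_len):
    """Yield every string over alphabet with length min_len..max_len."""
    for length in range(min_len, max_len + 1):
        for combo in product(alphabet, repeat=length):
            yield "".join(combo)

def search_space_size(alphabet_size, min_len, max_len):
    """Number of candidates the generator above would produce."""
    return sum(alphabet_size ** n for n in range(min_len, max_len + 1))

print(list(brute_force_candidates("ab", 1, 2)))
print(search_space_size(26, 1, 4))   # lowercase letters, up to 4 characters
print(search_space_size(95, 1, 8))   # printable ASCII, up to 8 characters
```

Every option you can pin down (a smaller character set, a known length) shrinks the number of candidates, which is why knowing even a little about the password helps so much.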

Ok, well, is there any guaranteed method?

The good news (bad news for some) is that, yes, there is a guaranteed method. It does not recover the password, but it removes the password (decrypts the document) so that we can open the document without a password.

The bad news (good news for some) is that not all documents can be unprotected, especially those created and protected with recent versions of software such as MS Office 2010. The reason is that they use stronger 128-bit encryption and other such methods to protect the document.

So what kind of protection can be broken?

Well, if the document is encrypted with 40-bit RC4, then we can break the encryption. Earlier versions of MS Office (97/2000/XP/2002/2003) use RC4 encryption, and of these, MS Office 97/2000 use only 40-bit encryption. XP/2002 and 2003 offered an option to choose a different encryption method. Hence for XP/2002 and 2003, we can unprotect a document only if it was protected with the default method, that is, if the user did not choose any advanced option while setting the password.

How do we unprotect (decrypt) the document then?

Before we get to this, it helps to know what RC4 encryption is. RC4 is a stream cipher, an encryption algorithm that encrypts streams of data (the document contents, in our case). Decrypting the data requires a key, which is generated from the password.

So here is the catch: decryption uses a key, not the password itself. If we have the key, we can decrypt the document without knowing the password.
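For reference, here is a short sketch of the textbook RC4 algorithm in Python. It only shows the cipher itself: how Office turns a password into the key, and which parts of a file are encrypted, are not covered here. Because RC4 simply XORs the data with a keystream, the same function both encrypts and decrypts.

```python
def rc4(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt data with RC4 (the operation is symmetric)."""
    # Key-scheduling algorithm: shuffle the state array using the key.
    s = list(range(256))
    j = 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]

    # Pseudo-random generation: XOR each data byte with a keystream byte.
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

ciphertext = rc4(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4(b"Key", ciphertext))  # applying the same key again restores the text
```

Note that nothing in the function asks for a password. Whoever holds the key bytes can decrypt.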

How to get the encryption key?

The method for getting the key remains the same: brute force. But the advantage here is that brute forcing the key takes far less time than brute forcing the password. As it is 40-bit encryption, the entire range of possible keys (called the key space) is 2^40 combinations, about 1.1 trillion. So we try all of these combinations to find the key, which can then be used to decrypt the document.
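A quick way to see why attacking the key can beat attacking the password is to compare the two search spaces. The sketch below is plain arithmetic, with function names invented for this example. It finds the password length at which the number of possible passwords overtakes the 2^40 possible keys.

```python
KEY_SPACE = 2 ** 40  # every possible 40-bit key

def password_space(alphabet_size, length):
    """Number of passwords of exactly this length over the alphabet."""
    return alphabet_size ** length

def crossover_length(alphabet_size, key_space=KEY_SPACE):
    """Shortest password length whose search space exceeds the key space."""
    if alphabet_size < 2:
        raise ValueError("alphabet_size must be at least 2")
    length = 1
    while password_space(alphabet_size, length) <= key_space:
        length += 1
    return length

print(f"{KEY_SPACE:,}")
print(crossover_length(95))  # all printable ASCII characters
print(crossover_length(26))  # lowercase letters only
```

From that length onwards, searching the keys is less work than searching the passwords, no matter how good the password is.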

Searching the entire key space with one of today's processors (Intel dual core, Core 2, etc.) in a single process (one thread) takes only a couple of days, maybe a week. With a more advanced processor or multiple processes (threads), the key can be broken in minutes to hours.
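These estimates are easy to reproduce once you pick a search speed. In the sketch below the two rates are assumptions chosen for illustration, not measurements of any real processor or graphics card. Plug in your own numbers.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY

def seconds_to_search(key_bits, keys_per_second):
    """Worst-case time to try every key of the given size."""
    return 2 ** key_bits / keys_per_second

# Assumed rates, for illustration only.
SINGLE_THREAD = 2_000_000      # keys per second on one CPU thread
GPU = 5_000_000_000            # keys per second on a graphics card

days_40_single = seconds_to_search(40, SINGLE_THREAD) / SECONDS_PER_DAY
minutes_40_gpu = seconds_to_search(40, GPU) / 60
years_128_gpu = seconds_to_search(128, GPU) / SECONDS_PER_YEAR

print(f"40-bit, one thread: {days_40_single:.1f} days")
print(f"40-bit, GPU:        {minutes_40_gpu:.1f} minutes")
print(f"128-bit, GPU:       {years_128_gpu:.2e} years")
```

With these assumed rates the 40-bit search comes out at roughly six days on one thread and a few minutes on the graphics card, while the 128-bit search runs to more than 10^21 years.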

See the chart below, from Wikipedia, showing the theoretical limits of a brute force attack at various levels of encryption.

Source

Do you still believe someone, or some tool, that claims to brute force a 128-bit encrypted document? Will you still waste your money on it?

And there is one more, newer technique called GPU acceleration, which uses the power of graphics cards for brute forcing. With it, the RC4 key can be brute forced in seconds or minutes!! Oh, man.

Once we have the key, we can decrypt the document using the standard RC4 algorithm. (Implementations of this algorithm in various programming languages are easy to find on the internet, so no worries.)

How to validate the encryption key?

This question may have come to your mind when I said we would try all possible key combinations to find the one that decrypts the document. Well, in the case of Word documents, each candidate key is verified against a verifier string (the encrypted verifier) stored in the RC4 encryption header of the document. So we should also know a little about document headers and the document structure.
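Here is a rough sketch of such a check in Python. It follows my reading of the RC4 header described in Microsoft's [MS-OFFCRYPTO] document: a 16-byte encrypted verifier followed by a 16-byte encrypted MD5 hash of that verifier, both decrypted with an RC4 key computed as MD5(5 key bytes + 4-byte block number). Treat that layout and derivation as assumptions to check against the specification. The function names are invented, and the header in the demo is synthetic, built by the script itself, not read from a real file.

```python
import hashlib

def rc4(key, data):
    """Textbook RC4; the same call encrypts and decrypts."""
    s, j = list(range(256)), 0
    for i in range(256):
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    out, i, j = bytearray(), 0, 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)

def block_key(key40, block=0):
    """Assumed derivation: MD5 of the 5 key bytes plus the block number."""
    return hashlib.md5(key40 + block.to_bytes(4, "little")).digest()

def key_is_valid(key40, enc_verifier, enc_verifier_hash):
    """Decrypt both header fields in one RC4 stream and compare the hash."""
    plain = rc4(block_key(key40), enc_verifier + enc_verifier_hash)
    verifier, verifier_hash = plain[:16], plain[16:32]
    return hashlib.md5(verifier).digest() == verifier_hash

def find_key(enc_verifier, enc_verifier_hash, candidates):
    """Try each candidate number as a 5-byte key; return the match or None."""
    for n in candidates:
        key40 = n.to_bytes(5, "little")
        if key_is_valid(key40, enc_verifier, enc_verifier_hash):
            return key40
    return None

# Demo on a synthetic header, searching a tiny slice of the key space.
secret = (0x1234).to_bytes(5, "little")
verifier = bytes(range(16))
blob = rc4(block_key(secret), verifier + hashlib.md5(verifier).digest())
found = find_key(blob[:16], blob[16:], range(0x2000))
print(found == secret)
```

A real search would walk all 2^40 candidates instead of the first few thousand, which is far too slow in pure Python. Actual tools do this in optimized native or GPU code.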

You may read more about the MS Word document header structure here.

Source

If you would like to know how to read the document header programmatically, what the structure of a document is, how we validate the key against the verifier string and, most importantly, how we can technically crack the document, please check my next post, Crack password - RC4 40 bit decryption of documents. If I included all that here, this article would run to many pages.

I would also suggest searching the Internet for some of the new terms and keywords you learned here if you are interested in learning further.

Security concerns – last but not least

Hence, if you are really concerned about the security of your documents, I recommend always selecting 128-bit encryption and the other advanced security options under ‘Advanced’.

Also, think twice when you plan to buy a password cracking or recovery tool.

Comments 6 comments

Georgina Nibbet 5 years ago

Awesome software, pdfs are often really tough to work with. Another way to do this though would be to just go here where there are programs that will automatically turn excel sheets into pdfs.


danfresnourban 5 years ago from Fresno, CA

Great hub, very informative. I am going to bookmark this hub because I know that I will want to refer back to it at another time.


psf 5 years ago from Canada Author

Thanks for the comment, danfresnourban.


Joseph Becket 5 years ago

The article is very helpful! Thank you, I'll be sure to use your advice. Personally, I had some problems converting excel files over to PDF. Then I found this program at my work that quickly and painlessly converts my excel documents into pdf so I didn't have to do it manually (which would end up taking aggravating hours upon hours). It provides my business a simple way to make individualized statements for our business associates. Check it out here.


Andrew 5 years ago

Hi, Can you please provide me with a free program that will recover or remove the password from a word doc...i can't open the file at all as it needs the password as soon as you try to open it...i've tried lots from the web but are no use, possibly require brute force one. Thanks


Gavin Smith 3 years ago

I am developing a free program to remove the 40-bit RC4 encryption on Excel documents with the default Excel 2003 encryption, which should save some people from risking paying for a product which may not work.

It currently supports scanning for a valid decryption key, and supports decrypting with that key. Support for also decrypting Word documents is also planned for the future.

The project repository is at https://github.com/GavinSmith0123/crackxls2003. A Windows executable is included, which should be used at the command line. I would welcome reports of whether others have successfully downloaded and run this program.
