The information collected over time can tell someone everything about you and can be used to help or harm you. This data is known as metadata.
What is Metadata?
Metadata is data about data. Don’t you hate it when people use a word to define that word? Data is a set of facts that are completely objective and cannot be reasonably disputed. One is one and zero is zero. The temperature is what it is. Today’s date is today’s date. You get the point. Metadata is a set of facts about a single piece of data. Let’s illustrate this with the example of writing a letter in Microsoft Word. You type a letter to send to authorities about something going on at work that is unethical, possibly even illegal. Yes, we got dramatic right away. The letter is information made from the characters you typed in a certain order. The characters you typed are stored as a bunch of ones and zeroes. The ones and zeroes are the data that makes up the information in the letter. When the ones and zeroes are turned into the letter, other data about the letter gets created too. That metadata includes when you typed the letter, who typed it, when it was last saved, and what version of Word it was created in. All of these are data about the data, or metadata.
What Does Metadata Do?
In Microsoft Office, most of the metadata is there just for your benefit. It can help you find the newest version of a document or see who created the document so you could ask them questions about it. It helps to keep track of edits or comments on documents. It is also used by the Office program and other programs to work with the document. Windows Explorer uses the information to categorize and sort documents, for example.
Why Would I Want to Remove Metadata?
Let’s go back to the letter you’re sending to the authorities about something sketchy going on at work. You’re doing this anonymously because you fear retribution, or you just don’t want to be involved beyond bringing it to the authorities’ attention. It’s whistleblowing. You go all out and get a temporary e-mail address and send it from a public computer at a library to cover your tracks. Because of metadata, the document may still carry information that can be used to link it back to you. It may even still have your name attached to it. Even worse, changes you made to the document, although no longer visible to you, may still be in the document. If you wrote a paragraph about something specific to you but then removed it because it could be used to identify you, it could still be part of the file in the form of metadata.
How Can I View Office Metadata?
Below are several ways to see what metadata is attached to your Word, Excel, or PowerPoint files. Metadata surrounding e-mail sent from Outlook is far more complex and beyond the scope of this article.
View Metadata in Word, Excel, or PowerPoint
With the document, workbook, or presentation open that you want to check: Click on File in the top-left corner. On the Info screen, you’ll see plenty of information such as Size, Pages, Words, Total Editing Time, Last Modified, Created, and Related People among other data. Under that data, click on Show All Properties to see more data. NOTE: Pay attention to the Template data. If you used a template that has your name, or a company name in its filename, that could be tracked to you.
View Metadata in Windows Explorer
Open Windows Explorer and navigate to where you have saved the file. Right-click on the file and click on Properties. In the Properties window, click on the Details tab. You’ll see all the metadata in a compact and concise list.
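Some of what the Details tab shows, such as the size and the modified date, comes from the file system rather than from Office. If you’d rather read those values with a script, here is a minimal Python sketch. The function name file_system_details is made up for this example, and it only covers file system details, not Office properties like the author.

```python
import os
from datetime import datetime


def file_system_details(path):
    """Return a few file system facts about a file (not its Office properties)."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        # Time the file's contents were last changed.
        "modified": datetime.fromtimestamp(info.st_mtime),
        # Time the file was last read; some systems update this lazily or not at all.
        "accessed": datetime.fromtimestamp(info.st_atime),
    }
```

Calling file_system_details on the path to your letter returns a small dictionary you can print or compare before and after cleaning the file.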
View All Metadata for Word, Excel, or PowerPoint Files
Extensible Markup Language (XML) is a de facto standard format for storing metadata in computing. It accompanies all kinds of files, and Microsoft Office files are no exception. Viewing these XML documents is surprisingly easy. Let’s do this with a Word file. Change the extension of the file from .docx to .zip. Yes, each Office file type that ends in x is a compressed file containing XML documents. You’ll get a warning about doing this. Click Yes. Right-click on the file and select Extract All… In the window that opens, you’ll be asked where you want to save the extracted files and whether you want to show them when finished. The default values are good. Click Extract. Once the extraction is done, you’ll see three folders and an XML file. Explore these files to see what information is stored there. If you double-click on an XML file, it will likely open in a web browser such as Internet Explorer. It will look odd, but you should be able to make out what most of the information means. There are three XML files that may contain your name: core.xml in the docProps folder, and document.xml and people.xml, both in the word folder.
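If renaming and extracting feels like a lot of clicking, the same peek at core.xml can be done with a few lines of Python, since the standard zipfile module can open a .docx directly. This is only a sketch: the function name read_core_properties is invented for this example, and it assumes the file has a docProps/core.xml part, which a document saved by Word normally does.

```python
import zipfile
import xml.etree.ElementTree as ET


def read_core_properties(path):
    """Return the properties stored in docProps/core.xml as a dict."""
    # An Office file ending in x is already a ZIP archive, so no renaming is needed.
    with zipfile.ZipFile(path) as package:
        xml_bytes = package.read("docProps/core.xml")
    root = ET.fromstring(xml_bytes)
    properties = {}
    for element in root:
        # Tags look like "{namespace-uri}creator"; keep only the part after "}".
        local_name = element.tag.rsplit("}", 1)[-1]
        properties[local_name] = element.text or ""
    return properties
```

Pass it the path to your document and look at entries such as creator and lastModifiedBy in the result. Those are the two places in core.xml where a name usually turns up.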
How to Delete Metadata from Microsoft Word, Excel, or PowerPoint
It took a long time to get here, but if you’re going to do something like this you should know exactly why. Let’s get on with it.
Delete Metadata in Word, Excel, or PowerPoint
Click on File in the top-left corner. On the Info page, click on Check for Issues on the left, near the middle of the page. Click on Inspect Document. The Document Inspector window will open. Make sure all the checkboxes in the Document Inspector are checked, then click the Inspect button. Once the Document Inspector is done, you’ll see information about what kind of data it found. A green checkmark in a circle means it found no data of that type. A red exclamation mark means it found data of that type. Next to that data type’s description you’ll see the Remove All button. Click on that to remove all data of that type. There may be several of these buttons, so scroll down to ensure you get all of them. After you’ve removed the metadata, you may want to click the Reinspect button, just to make sure it didn’t miss anything. Save your document now to ensure the data doesn’t get re-entered.
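The Document Inspector is the route to rely on. Still, it can help to see roughly what emptying the author fields amounts to at the file level. The Python sketch below is not how the Document Inspector works internally and is no substitute for it. It writes a copy of the file in which two name fields in docProps/core.xml are blanked, and it touches nothing else, so comments, tracked changes, and people.xml are left as they are. The function name blank_author_fields and the NAME_TAGS list are made up for this example, and the sketch assumes the XML uses the dc: and cp: prefixes that Word normally writes.

```python
import re
import zipfile

# Tags in docProps/core.xml that commonly hold a person's name.
# Assumption for illustration only: this is not a complete inventory.
NAME_TAGS = ("dc:creator", "cp:lastModifiedBy")


def blank_author_fields(source_path, copy_path):
    """Write a copy of an Office file with the author fields in core.xml emptied."""
    with zipfile.ZipFile(source_path) as source:
        with zipfile.ZipFile(copy_path, "w", zipfile.ZIP_DEFLATED) as copy:
            for item in source.infolist():
                data = source.read(item.filename)
                if item.filename == "docProps/core.xml":
                    text = data.decode("utf-8")
                    for tag in NAME_TAGS:
                        # Keep the opening and closing tags, drop what is between them.
                        pattern = rf"(<{tag}>).*?(</{tag}>)"
                        text = re.sub(pattern, r"\1\2", text, flags=re.DOTALL)
                    data = text.encode("utf-8")
                # Every other part of the package is copied unchanged.
                copy.writestr(item.filename, data)
```

Working on a copy keeps the original intact in case the result does not open properly in Word.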
How to be Certain the Metadata was Deleted
Go through the steps above under View All Metadata for Word, Excel, or PowerPoint Files. Upon inspecting the core.xml, document.xml, and people.xml files, you should see that there is no personal data in the document anymore. If people.xml is no longer there, that is fine too. If you change the extension back from .zip to .docx, you’ll be able to open the file normally in Word again.
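Reading XML by eye gets tedious, so here is a small Python sketch that searches every file inside the document for a piece of text, such as your name. The function name parts_containing is invented for this example. Treat the result with care: a hit proves the text is still in the file, but an empty list proves less, because Word can split text across several XML elements and the search is case-sensitive.

```python
import zipfile


def parts_containing(path, text):
    """List the files inside an Office document that still contain `text`."""
    needle = text.encode("utf-8")
    with zipfile.ZipFile(path) as package:
        return [
            name
            for name in package.namelist()
            # Compare raw bytes so every part is checked, XML or not.
            if needle in package.read(name)
        ]
```

Run it on the cleaned file with your own name as the text. Any filename it returns is a place to look at more closely.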
How to Delete Metadata in Windows Explorer
This is a good method if you want to strip metadata from several files quickly. Two or more files can be cleaned in a matter of seconds. Open Windows Explorer and navigate to the file you want to remove metadata from. Right-click on the file and click on Properties. In the Properties window, click on the Details tab, then click on Remove Properties and Personal Information. You can remove information in two ways. You can remove metadata from the original file or make a copy of the file without any metadata.
Remove Metadata from Original File
Select Remove the following properties from this file: then either check only the boxes you want or click on the Select All button. Then click OK.
Make a Copy with No Metadata
In the Remove Properties window, select Create a copy with all possible properties removed, then click the OK button. This will make a copy of the file and add the word Copy to the end of the filename. That copy will have every property that Windows is able to remove stripped from it. Compare the properties of the original and the copy to see the difference.
In the Clear?
Does this mean you’re in the clear? You cannot be identified now from the document? That’s difficult to say. What you do with the document next will determine that. Any further digital processing of the document, like emailing it, could add metadata back into the chain. A viable option is to print the document and mail it. It’s difficult to get metadata from paper.