Basic Processes, Terms and Techniques in Data Analysis

Updated on January 3, 2013
Windows computers crashing to the 'Blue Screen of Death'.  Public Domain photo courtesy of Grj23.
Data analysis isn’t exactly rocket surgery, but if you don’t know how all the parts of the jigsaw puzzle fit together it can get complex fast. Whether you’re applying for your first job in data management and statistics, or you’ve been thrown in at the deep end in a company with little or no training beyond the corporate Key Performance Indicator (KPI) bible, this article will take you through a whistle-stop tour of the things you need to know.

I've outlined a couple of the common software programs used in data analysis, like Crystal Reports and Excel, but these are only examples of some of the most prevalent – there are hundreds of others tailored to specific business areas. If you are applying for a job, check the advertisement for mention of specific software. Even if you don't have experience using it, just a little background knowledge of how similar software or programming languages work will go a long way in an interview.

Photo by Varano. Licence: CC-BY-SA.
Audits

Audits can be external, internal or self-audits. Unless a potential problem or anomaly has been identified, external audits will be infrequent and will usually spot-check just a small amount of raw data and perhaps double-check any formulas used to calculate official or statutory measures like KPIs. Internal audits may be more rigorous and include an overhaul of any spreadsheets that are regularly used to calculate report data. Self-audits are informal and happen on a day-to-day basis as you spot errors or obtain unexpected results.
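
A spot-check of raw data can be as simple as pulling a small random sample of records to verify by hand. The following is a minimal sketch only: the function name `spot_check_sample`, the 5% sample size and the record identifiers are all assumptions made for illustration, not a standard audit procedure.

```python
import random


def spot_check_sample(records, fraction=0.05, seed=None):
    """Pick a random subset of records for manual checking.

    fraction is the share of records to check (an assumed 5% by default);
    seed makes the selection repeatable so the check can be re-run.
    """
    if not records:
        return []
    sample_size = max(1, round(len(records) * fraction))
    rng = random.Random(seed)
    return rng.sample(records, sample_size)


# Hypothetical raw data: 200 record identifiers.
record_ids = list(range(1, 201))
to_check = spot_check_sample(record_ids, fraction=0.05, seed=2013)
print(sorted(to_check))
```

Fixing the seed means an auditor can reproduce exactly which records were chosen.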

Balanced Scorecard

This is usually a collection of the most important measures of a company’s success in various key areas (whether these measures are KPIs or otherwise). It is basically the same as a Red-Amber-Green (RAG, or traffic-light) report, highlighting important points that managers and colleagues can identify quickly through the use of standard symbols and colour-coding to show where the company is improving and where it is under-performing.

Chair-to-Keyboard Interface (CKI)

This is what staff in the I.T. department call you, me, and anyone else who uses the computers they fix. Don’t be offended – they spend all their days telling people to ‘turn it off and then turn it back on again.’ It’s small wonder that they can be a little grumpy occasionally.

Crystal Reports

Crystal Reports is a data interrogation program from a company called SAP (Systems, Applications and Products in Data Processing) that acts as an interface between a database and the analyst. How well Crystal Reports works depends largely on the strength of design of the database it is attached to: a badly-designed database will return little meaningful data no matter how many data mining tools are attached to it. If the database has structural integrity, though, and analysts are even fairly competent in the use of Visual Basic (VB), Crystal Reports can be a huge time-saver.

Data Collation

This is the first step towards analysing the data after it has been collected: putting it all into the right spreadsheets, checking anomalies and grouping related data together. It is the preliminary work before putting everything together coherently into reports, and it will be a part of the material that will be examined in any official or semi-official audit.
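
As a rough illustration of collation, the Python sketch below groups related records together and sets aside anything that looks anomalous. The survey rows, the 1 to 10 scoring scale and the function name `collate` are all hypothetical; in practice this work is often done in a spreadsheet instead.

```python
from collections import defaultdict

# Hypothetical survey rows: (department, satisfaction score on a 1-10 scale).
rows = [
    ("Sales", 7),
    ("Sales", 9),
    ("Kitchen", 8),
    ("Kitchen", 42),     # outside the scale: probably a typing error
    ("Delivery", 6),
    ("Delivery", None),  # missing answer
]


def collate(rows, low=1, high=10):
    """Group valid scores by department and list the anomalies separately."""
    grouped = defaultdict(list)
    anomalies = []
    for department, score in rows:
        if score is None or not (low <= score <= high):
            anomalies.append((department, score))
        else:
            grouped[department].append(score)
    return dict(grouped), anomalies


grouped, anomalies = collate(rows)
print(grouped)
print(anomalies)
```

Keeping the anomalies in their own list, rather than silently dropping them, leaves a trail that can be shown to an auditor.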

Data Flow Symbols.  Released into the Public Domain by Wooptoo
Data Collection

Your data will come in raw form – such as all the sheets of returned customer surveys that have to be entered into a database – or, more often, will be extracted from databases or obtained from colleagues in various departments of your company.

Data Manipulation

‘Manipulation’ does not mean ‘corruption’! It simply means any form of processing the raw data to produce meaningful figures. For instance, to say that 5 employees each had a day’s sickness absence tells you very little on its own, but to say that 5 employees out of a total of 100 company employees had some sickness absence gives a percentage figure (5/100 = 5% of employees), which is more meaningful. The figure of 5% is not data in its raw form – it has been manipulated.
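
The sickness-absence example can be written out in a few lines of Python. This is only a sketch: the function name `absence_rate` is made up for illustration, and the figures are the ones used in the example above.

```python
def absence_rate(employees_absent: int, total_employees: int) -> float:
    """Return the percentage of employees with some sickness absence."""
    if total_employees <= 0:
        raise ValueError("total_employees must be positive")
    return 100 * employees_absent / total_employees


# Raw data: 5 employees absent, 100 employees in total.
rate = absence_rate(5, 100)
print(f"{rate:.0f}% of employees had some sickness absence")
```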

Excel (Microsoft)

Excel is a spreadsheet program. If you are, or intend to become, a data analyst, this is the application you will need to learn to love.

Photo by Juergen Rosskamp. Licence: CC-BY-SA.
Gap Analysis

A company wants to do X: let’s say that a pizza delivery company wants to match a rival firm’s promise to deliver to customers within 20 minutes. The analysis of what needs to be done to achieve this, and what obstacles stand in the way, is a Gap Analysis. For large companies, professional Project Managers might be employed to manage the task. If the change is a major one, the Gap Analysis can be quite complex, and involve bringing in new staff, changing KPIs, allocating funds, checking legal requirements, putting new audit systems in place and many other tasks. For smaller companies or for lesser changes it can often be a much more informal process. Identifying the X – the goal of the company – in the first place is often sparked by reports from data analysts, and the measurement of progress towards the goal is almost always assessed by further analysis of data.

High-Level Reports

High-level Reports are intended for managers and others who may not be concerned with the details and technical aspects of data. RAG reports and Balanced Scorecards are examples of high-level reports – they highlight the most critical aspects of a business.

Key Performance Indicators (KPIs)

If something is vital to a company – whether it is a measure of the profit and loss, or employee sickness absence rates, or the rate at which raw materials are ordered and used – then it is a Key Performance Indicator. Some companies, especially smaller ones, don’t use the formal term for these measures, but it’s a handy shortcut phrase. If KPIs and their data are used well, they can measure the success of a company and highlight problem areas before they get out of hand. For public bodies like local and national government departments, some KPIs are statutory and come with a whole host of related work like internal and external auditing, policy writing and quality assurance.

Low-Level Reports

Low-level reports are those prepared by technical staff for technical staff, and will contain much more detailed data. Often, these are not really in ‘report’ form, and might simply be a copy of the raw data with some commentary, for instance if an audit has found errors or anomalies in data or formulae.


Out-turn

The out-turn is the final figure for a measured area, produced once all the data has been collected, collated, manipulated and analysed. Barring the discovery of any errors, the out-turn is the last word on any particular measure for the time period it covers.

Quality Assurance

In a large corporation there will often be a team of people whose job it is to check that products and services are of a certain standard. This can be in the form of spot-checks, customer or shareholder surveys, and a certain amount of auditing. For some products and services, quality assurance can be a legal requirement, but for many small companies it is an ad hoc process. Data Analysts may have a combined role that includes quality assurance work since it is often the data they are working on that can first highlight a problem with the company.

RAG Reports are also known as Traffic Light Reports.  Photo released into the Public Domain by Velela
RAG Reports

High-level reports for management. These are reports where key points are highlighted by the use of Red-Amber-Green colour-coded symbols so that managers and colleagues can identify at a glance critical areas that are underperforming. RAG reports are especially useful for monthly or yearly reports where the performance of a large number of KPIs must be reported.
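
The colour-coding behind a RAG report usually comes down to comparing each measure with its target. The sketch below shows one possible rule, not a standard: the function name `rag_status`, the 10% amber margin and the KPI figures are all assumptions, and it only handles measures where a higher value is better.

```python
def rag_status(actual: float, target: float, amber_margin: float = 0.10) -> str:
    """Classify a 'higher is better' measure as Red, Amber or Green.

    Green: target met or exceeded.
    Amber: below target, but within amber_margin (10% by default) of it.
    Red:   further below target than that.
    """
    if actual >= target:
        return "Green"
    if actual >= target * (1 - amber_margin):
        return "Amber"
    return "Red"


# Hypothetical KPIs as (actual, target) pairs, all 'higher is better'.
kpis = {
    "On-time deliveries (%)": (96.0, 95.0),
    "Customer satisfaction (%)": (84.0, 90.0),
    "Orders per day": (70.0, 100.0),
}

for name, (actual, target) in kpis.items():
    print(f"{name}: {rag_status(actual, target)}")
```

Measures where a lower value is better, such as sickness absence, would need the comparisons reversed.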

Image by Rodrigo Mallmann Guerra. Licence: CC-BY-SA
Risk Assessment

This is similar to a Gap Analysis, and indeed the two terms are sometimes used interchangeably, but a Risk Assessment can also cover a much wider area. If a risk is identified within a company, whether it is a physical risk to employees, customers or the public, or a financial or legal risk, then a Risk Assessment usually takes place, formally or informally. As an example, suppose you have a pizza company and a burger bar opens a branch next door. You might do an assessment posing the question ‘What is the risk that they will steal all my customers?’ If the new food outlet is shiny and bright and appealing, you might think that the risk is high, and then go on to do a Gap Analysis on how you can beat your competitor by refurbishing your own shop front, getting new tables, and offering a wider variety of foods. More serious risk assessments can be around things like fire hazards or disabled access. From a data analyst’s perspective, everyday risk assessments take place on all aspects of a company’s information, and risky trends are reported regularly to managers and colleagues. If a company is working well, there will be a smooth flow of information between departments.

Service Level Agreement (SLA)

This is part of the contract between a company providing any service, and the customer receiving that service. The SLA sets out what the customer can expect, so for instance a nursing-home provider may include things like nursing staff availability hours and the standard of nutrition to be provided in meals offered to residents; whereas a pizza delivery company might promise that a customer will receive their pizza within half an hour. SLA data can feed into KPIs, Quality Assurance, Audit trails and survey data (among others).
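
To show how SLA data might feed into a KPI, the sketch below works out what percentage of pizza deliveries met the half-hour promise. The delivery times and the function name `sla_compliance` are invented for illustration.

```python
# Hypothetical delivery times, in minutes, for one evening's orders.
delivery_minutes = [22, 28, 35, 19, 30, 41, 27, 25]
SLA_LIMIT_MINUTES = 30  # the "within half an hour" promise


def sla_compliance(times, limit):
    """Return the percentage of deliveries made within the limit."""
    if not times:
        raise ValueError("no deliveries to measure")
    met = sum(1 for t in times if t <= limit)
    return 100 * met / len(times)


compliance = sla_compliance(delivery_minutes, SLA_LIMIT_MINUTES)
print(f"{compliance:.0f}% of deliveries met the SLA")
```

Here a delivery taking exactly 30 minutes is counted as meeting the promise; whether it should is a question for the wording of the SLA itself.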

Photo by Casito. Licence: CC-BY-SA.
Targets

A target is a realistic goal value for any measure, whether the measure is related to profit, service level, or any other performance or business area. Ideally, targets are set based on a combination of the trend shown by previous data, and ‘vision’ (or hopes) based around any improvement projects that have been implemented. They should be realistic and possible, and a natural progression of trends and achievable forecasts. Although target-setting is often seen as the job of senior management, in reality the task often falls to the data analysts, either to advise management on target-setting, or to decide outright what the company targets will be.
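
One simple way to combine trend and ‘vision’ is to extend the average change seen in past data by one period, then add an uplift for any improvement projects. The sketch below does just that. It is an assumption-laden illustration rather than a recommended forecasting method, and the names `trend_forecast` and `suggest_target`, the order figures and the 5% uplift are all hypothetical.

```python
def trend_forecast(history):
    """Extend the average period-on-period change by one more period."""
    if len(history) < 2:
        raise ValueError("need at least two periods of data")
    average_change = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + average_change


def suggest_target(history, uplift=0.0):
    """Trend forecast plus a 'vision' uplift (0.05 means 5% extra)."""
    return trend_forecast(history) * (1 + uplift)


# Hypothetical monthly order counts for the last four months.
monthly_orders = [100, 110, 120, 130]
forecast = trend_forecast(monthly_orders)
target = suggest_target(monthly_orders, uplift=0.05)
print(f"Trend forecast: {forecast:.0f}, suggested target: {target:.0f}")
```

A target far above the trend forecast is a sign that the ‘vision’ part may be doing more work than the data supports.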

Visual Basic (VB) (Microsoft)

Visual Basic is a high-level programming language, and one that is quite simple to learn. Learning VB will give you an edge, since it works inside, or as an interface to, many other data management programs, such as Excel (which uses Visual Basic for Applications (VBA), a cut-down version of VB) and Crystal Reports.

