Database analysts are the people who keep databases functioning well over time. They solve a problem: though most non-technical people never see it, databases decay with use. New systems, updates, and file types pose a constant threat to a company’s use of data, and database analysts protect against these threats.
In a sentence, database analysts study and maintain a database’s structure and upgrade relevant technology in order to ensure it produces value for an organization over time. Database analysts also stay up-to-date on the latest programs to ensure their database’s architecture doesn’t lose relevance.
This description may sound like a handful, so let’s break it down into small pieces next. For a more thorough discussion of data analysis, you can download the free Intro to Data Analysis eBook.
What is a database analyst?
Organizations with large amounts of data need to store it in a database. A database analyst is a technical employee responsible for ensuring the usefulness of all data-related information and programs, including the database management system, any data visualization systems, data query language efficiency, and the data dictionary.
You can think of the database analyst like a gardener. If she takes care of the garden, then it prospers over time. But if she doesn’t water the plants and trim them back on a regular basis, the garden will shrivel and become overrun by weeds.
The same applies to database analysts and their databases.
Companies that need data
Database analysts are an important part of any organization that needs data to run its operations. They’re more common in companies looking to scale, companies that need data to scale (startups like Uber and Airbnb were in this category), and large organizations with long-term customers.
Indeed, tech companies aren’t the only ones who depend on data. Banks, hotel chains, steel manufacturers, and pretty much any large organization recognized the need for data analysis and began collecting data a long time ago.
I think it’s easy to overlook the sheer range of applications a database analyst’s skills have. That’s why, when we ask what a database analyst is, it’s important to mention not only what they do, but also that they do it in almost every industry on Earth today. The one exception, perhaps, would be small businesses, in which information is simple enough for decision-makers to understand intuitively.
In my experience, database analysts’ personalities are also an important part of the mix. As you can imagine, database analysts tend to be analytical, detail-oriented people who like to ensure that every piece of a puzzle fits together nicely. Counter-intuitively, they are often very talkative individuals. They like to analyze topics to understand them to the very last detail.
What does a database analyst do?
We’ve already given a blanket statement for what database analysts do. They study and maintain databases to keep them fresh over time, just like a gardener keeps the garden fresh. But what do database analysts do specifically, on a day-to-day basis? Some activities include the following.
- Maintaining database viewer rights
- Maintaining formatting against decay and degradation
- Updating data dictionaries and data marts
Database Analyst Responsibilities
Let’s look more deeply into the following items: maintaining viewer rights, data decay resistance strategies, data consistency strategies, bulk updates to data, data organization and culling, data dictionary descriptions, data mart creation, data lineage documenting, and of course system updates.
1. Maintaining viewer rights
Database analysts have to manage viewer rights. In an organization there is usually one central database, but tens of departments. Each department has different access needs, whether it’s for confidentiality reasons or simple ease-of-access. The database analyst must construct structures alongside the Business Intelligence teams so that data viewership is a safe and smooth experience.
This is where data wrangling and cleaning skills become very useful. The database analyst must carefully construct his/her original database so that it’s easy to cut and splice columns in order to give the correct viewership to a given department. For example, consider the following table:
Customer_ID | Social Security Number (invented for example) | First purchase? | Date | Age |
---|---|---|---|---|
Jim | 568-0002x | Yes | 12/03/2018 | 32 |
Karly | 583-0948x | Yes | 11/02/2018 | 40 |
Samson | 385-0293x | Yes | 04/28/2019 | 34 |
Nick | 394-0039x | No | 06/20/2020 | 84 |
Social security numbers are highly sensitive information and are kept only for auditing purposes. No department should be able to see them. This means that the database analyst must make sure to hide this field (column) from all viewers’ access windows.
In addition, the accounting department must submit a demographics report to local government on a monthly basis. For this report, they only need access to the Date and Age columns. The First purchase? column is excess information. The database analyst should therefore limit accounting’s viewer access window to those two columns.
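To make the idea of a viewer access window concrete, here is a minimal sketch in Python using the built-in sqlite3 module. The table name `customers`, the column names, and the view name `accounting_view` are assumptions invented for this example. SQLite has no user accounts, so the view only illustrates which columns accounting would see; in a multi-user database system you would typically grant accounting access to the view rather than to the underlying table.

```python
import sqlite3

# In-memory database holding the example table from above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customers (
        customer_id    TEXT,
        ssn            TEXT,
        first_purchase TEXT,
        purchase_date  TEXT,
        age            INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [
        ("Jim", "568-0002x", "Yes", "12/03/2018", 32),
        ("Karly", "583-0948x", "Yes", "11/02/2018", 40),
        ("Samson", "385-0293x", "Yes", "04/28/2019", 34),
        ("Nick", "394-0039x", "No", "06/20/2020", 84),
    ],
)

# Hypothetical access window for accounting: only Date and Age are exposed.
# The SSN and First purchase? columns never appear in the view.
conn.execute(
    "CREATE VIEW accounting_view AS SELECT purchase_date, age FROM customers"
)

cursor = conn.execute("SELECT * FROM accounting_view")
accounting_columns = [column[0] for column in cursor.description]
accounting_rows = cursor.fetchall()

print(accounting_columns)
print(accounting_rows)
```

Because the view selects columns by name, adding a new sensitive column to `customers` later does not expose it to accounting by accident.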
Viewer authorization
Another important aspect of viewer rights is automated viewer authorization. Above we talk about permanent situations. But in reality, each department’s needs are dynamic. In many organizations they change on a daily basis.
That’s why database analysts must take care to implement automated viewer authorization processes. When well executed, these processes allow any employee requesting access to click a button, which sends a request to their supervisor. The supervisor then approves or denies the access, and the database analyst acts accordingly.
It’s very common for database analysts to use what’s called data marts to do so. Data marts are specific segments of larger databases built bespoke for the needs of each department.
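The request-and-approve flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AccessRequest` class, the `permissions` registry, the data mart name `accounting_mart`, and the employee names are all hypothetical, and a real system would store requests in the database and notify the supervisor.

```python
from dataclasses import dataclass

# Hypothetical registry: data mart name -> employees allowed to view it.
permissions = {"accounting_mart": set()}


@dataclass
class AccessRequest:
    employee: str
    data_mart: str
    status: str = "pending"


def submit_request(employee, data_mart):
    """The employee 'clicks the button': a pending request is created."""
    return AccessRequest(employee, data_mart)


def review_request(request, approved):
    """The supervisor approves or denies; access changes only on approval."""
    request.status = "approved" if approved else "denied"
    if approved:
        permissions.setdefault(request.data_mart, set()).add(request.employee)
    return request


request = submit_request("dana", "accounting_mart")
review_request(request, approved=True)
print(request.status, sorted(permissions["accounting_mart"]))
```

Keeping the request as its own record, separate from the permissions, means every grant can later be traced back to who asked for it and who approved it.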
2. Data maintenance against data decay & degradation
Perhaps the biggest risk for data-based organizations is the slow decay of media files over time. On a technical level, all data is stored in bytes on data storage devices. These devices grow old with time, and they accumulate non-critical failures. Hardware teams very rarely notice these non-critical failures before it’s too late.
But database analysts are on the front line. They can feel when the system is slowing down. What’s more, they’re up-to-date on the age of all different kinds of data, and they know where each is stored. This means they know when it’s time to bulk update the files.
This may sound simple, but the upgrading process requires careful planning, as each file type has a different lifespan. Each file type also has a different impact on the company’s operations. In some cases, a sales portal may need to shut down for upgrades. Every minute of down time is money lost, so database analysts need to anticipate bulk updates and create plans on a yearly basis accordingly.
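As a rough sketch of what such planning might look like, the Python snippet below orders file types by when their next bulk update falls due. The file types, lifespans, and refresh dates are invented for illustration; real values would come from your own storage inventory.

```python
from datetime import date, timedelta

# Hypothetical lifespans (in days) and last bulk-update dates per file type.
lifespan_days = {"images": 730, "spreadsheets": 365, "logs": 180}
last_refreshed = {
    "images": date(2019, 1, 15),
    "spreadsheets": date(2020, 3, 1),
    "logs": date(2020, 6, 1),
}


def refresh_schedule(lifespans, refreshed):
    """Return (file_type, due_date) pairs, earliest due date first."""
    due_dates = {
        file_type: refreshed[file_type] + timedelta(days=days)
        for file_type, days in lifespans.items()
    }
    return sorted(due_dates.items(), key=lambda item: item[1])


schedule = refresh_schedule(lifespan_days, last_refreshed)
for file_type, due_date in schedule:
    print(f"{file_type}: bulk update due by {due_date.isoformat()}")
```

With the due dates laid out in order, downtime for something like a sales portal can be booked well ahead of time instead of being forced by a failure.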
3. Data consistency measures
Database analysts also need to ensure that data entering the database or database management system is consistent. Formatting plays a huge role in data integrity. In fact, one small change in an entry can spell disaster when it comes time to visualize the data. Let’s look again at our example of customers from before:
Customer_ID | Social Security Number (invented for example) | First purchase? | Date | Age |
---|---|---|---|---|
Jim | 568-0002x | Yes | 12/03/2018 | 32 |
Karly | 583-0948x | Yes | 11/02/2018 | 40 |
Samson | 385-0293x | Yes | 04/28/2019 | 34 |
Nick | 394-0039x | No | 06/20/2020 | 84 |
Imagine that instead of writing a date in the “mm/dd/yyyy” format, you wrote it out in the “mmmm dd, yyyy” format. In many database systems, this entry would not be recognized as a date like the others. Instead, it would be recognized as a “string” element, and hold no field-relevant value. If you tried to visualize the number of Customers purchasing in that month, you would have one empty value.
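Here is a small Python sketch of that failure, using the standard library’s `datetime.strptime`. The function name `parse_purchase_date` and the sample entries are assumptions for this example; the third entry is Samson’s date written out in the long format.

```python
from datetime import datetime


def parse_purchase_date(value):
    """Return a datetime if value is in mm/dd/yyyy format, otherwise None."""
    try:
        return datetime.strptime(value, "%m/%d/%Y")
    except ValueError:
        # The entry is just a string as far as the system is concerned.
        return None


entries = ["12/03/2018", "11/02/2018", "April 28, 2019", "06/20/2020"]
parsed = [parse_purchase_date(entry) for entry in entries]

# Count customers purchasing in April: the long-format entry is missed.
april_purchases = sum(
    1 for parsed_date in parsed if parsed_date is not None and parsed_date.month == 4
)

print(parsed)
print("April purchases found:", april_purchases)
```

The April purchase exists in the data, but the count comes back as zero because the one entry that should match was never recognized as a date.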
Problems at scale
For one entry, it’s no problem. The problem arises when 10 different departments all enter their data in their own legacy formats. If all the data is different, and we’re looking at tens of thousands of rows, in most cases it’s best to start over. The time to clean such huge amounts of data is not worth it.
Database analysts have two options for handling inconsistent data before the system gets too crazy. First, they can implement data guidelines and hope that people follow instructions. Second, they can use data validation. Data validation is limiting data inputs based on criteria. In our dates example, it would be forcing the “mm/dd/yyyy” format. If you’re unfamiliar with data validation, Excel’s data validation feature is a good place to start.
In short, I enter three possible choices, “Yes,” “No,” and “Maybe.” Then I go to the data tab to create data validation on a new cell. I select the three possibilities, and they become the only options in the new cell. Just like that, I create data consistency. Database analysts do the same thing, but they do it at scale.
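At the database level, one common way to get the same effect is a CHECK constraint on the column. The sketch below uses Python’s built-in sqlite3 module; the table name `responses` and the helper `try_insert` are assumptions made up for this example, not a prescribed setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The CHECK constraint plays the role of Excel's dropdown list:
# only "Yes", "No", and "Maybe" can be stored in first_purchase.
conn.execute(
    """
    CREATE TABLE responses (
        customer_id    TEXT NOT NULL,
        first_purchase TEXT NOT NULL
            CHECK (first_purchase IN ('Yes', 'No', 'Maybe'))
    )
    """
)


def try_insert(customer_id, first_purchase):
    """Insert a row; return False if the database rejects the value."""
    try:
        conn.execute(
            "INSERT INTO responses VALUES (?, ?)", (customer_id, first_purchase)
        )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_insert("Jim", "Yes")
rejected = try_insert("Nick", "Sometimes")
print(accepted, rejected)
```

The rule lives in the table definition, so it applies to every department and every tool that writes to the table, not just to one spreadsheet.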
4. Data organization and data culling
We’ve talked some about data viewership and the use of data marts to create subgroups of data tailored to different departments. These are both methods of data organization, and more specifically data culling.
By definition, data culling is the process a data specialist uses to search and isolate data based on criteria. You should not confuse it with data querying, in which a user requests data based on an SQL query.
While the difference is subtle, it’s important. Data culling happens on the backend of a database, where database analysts determine structures and build the organization of viewership and viewer rights.
On the other hand, data queries are frontend requests for data from the database that can be viewed, downloaded, and sometimes, edited. Do not confuse the two – they’re different!
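One way to picture the split is the sketch below, again in Python with sqlite3. It follows the definitions used in this article; the table `marketing_mart`, the first-purchase criterion, and the 2018 query are hypothetical choices for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customer_id TEXT, first_purchase TEXT, purchase_date TEXT, age INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [
        ("Jim", "Yes", "12/03/2018", 32),
        ("Karly", "Yes", "11/02/2018", 40),
        ("Samson", "Yes", "04/28/2019", 34),
        ("Nick", "No", "06/20/2020", 84),
    ],
)

# Backend (culling): the analyst isolates first-time buyers into a
# hypothetical data mart, deciding what this department can see at all.
conn.execute(
    "CREATE TABLE marketing_mart AS "
    "SELECT customer_id, purchase_date FROM customers "
    "WHERE first_purchase = 'Yes'"
)
mart_size = conn.execute("SELECT COUNT(*) FROM marketing_mart").fetchone()[0]

# Frontend (querying): a user requests data from what was made available.
buyers_2018 = [
    row[0]
    for row in conn.execute(
        "SELECT customer_id FROM marketing_mart "
        "WHERE purchase_date LIKE '%/2018' ORDER BY customer_id"
    )
]

print(mart_size, buyers_2018)
```

The user’s query can only ever return rows the analyst placed in the mart, which is why the backend step shapes everything the frontend can do.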
5. Data dictionary descriptions
Database analysts need to help people understand data. Data dictionaries are a great way to do so. In a sentence, data dictionaries are data that describe data. They provide key information about different columns in a database, such as data type and required values. Most database management systems can fill out column info automatically.
However, one of the types of data they provide is “descriptions” or “definitions.” This field is a critical piece of the puzzle because it provides the user with core information. However, someone has to manually enter a description for every field.
In large databases, this can become a full time job.
Database analysts are responsible for making sure data dictionary description and definition fields are up to date. In some companies, they spend a majority of their working hours performing column and field analyses in order to provide this information. They do it so the rest of the organization does not have to.
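The split between automatic column info and manual descriptions can be sketched as follows. The example uses SQLite’s `PRAGMA table_info` through Python’s sqlite3 module; the `customers` table, the description texts, and the "MISSING" placeholder are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customer_id TEXT NOT NULL, purchase_date TEXT NOT NULL, age INTEGER)"
)

# Descriptions cannot be generated: someone has to write them by hand.
descriptions = {
    "customer_id": "Name used to identify the customer",
    "age": "Customer age in years at time of purchase",
}

# Column name, type, and required flag come straight from the database.
data_dictionary = []
for _cid, name, column_type, not_null, _default, _pk in conn.execute(
    "PRAGMA table_info(customers)"
):
    data_dictionary.append(
        {
            "column": name,
            "type": column_type,
            "required": bool(not_null),
            "description": descriptions.get(name, "MISSING"),
        }
    )

# Columns that still need a human-written description.
missing = [
    entry["column"] for entry in data_dictionary if entry["description"] == "MISSING"
]

for entry in data_dictionary:
    print(entry)
print("Needs a description:", missing)
```

A report like `missing` gives the analyst a to-do list each time new columns are added, rather than discovering undocumented fields when someone asks about them.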
6. Data lineage
Data lineage is another phrase for the full history of a data point from its creation to the present. It includes information about origin, locations, alterations, and data type transfers.
The reason it’s so important to have data lineage is root-cause analysis. If data and business intelligence analysts can trace the entire lineage of a troublesome data point, then they can more easily identify the source of a problem.
The trouble is that data lineage is, like data dictionaries, data about data. It must be stored and protected just as much as the database itself. Database analysts are responsible for maintaining this information as well.
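As a minimal sketch of lineage as "data about data," the Python snippet below attaches a history of events to a single value. The class names `LineageEvent` and `DataPoint`, the locations, and the recorded steps are all hypothetical; dedicated lineage tools track far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class LineageEvent:
    step: str       # what happened (created, type change, copied, ...)
    location: str   # where it happened
    detail: str     # free-text note for root-cause analysis


@dataclass
class DataPoint:
    value: object
    lineage: list = field(default_factory=list)

    def record(self, step, location, detail):
        """Append one event to this data point's history."""
        self.lineage.append(LineageEvent(step, location, detail))


def trace(point):
    """Return the full history, oldest event first, as readable lines."""
    return [f"{e.step} @ {e.location}: {e.detail}" for e in point.lineage]


# Nick's age: entered as text, converted to a number, then copied onward.
age = DataPoint("84")
age.record("created", "signup_form", "entered as text")
age.value = int(age.value)
age.record("type change", "customers table", "text converted to integer")
age.record("copied", "accounting_mart", "exposed to accounting")

for line in trace(age):
    print(line)
```

If the age later looks wrong in an accounting report, the trace shows each place the value passed through, which narrows down where to look first.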
7. Find and procure new database software
Most departments never hear about database analysts… until it’s time to change providers or update the system. Then everyone knows about database analysts, and nobody is happy about it.
When data begins to decay, or the provider of your database software starts to make mistakes, you run into problems with data storage and manipulation. You have to make a switch.
This process can be particularly frustrating for analysts because they can see these challenges coming far in advance of the critical moment, but management hardly ever wants to make major changes without at least some immediate impact.
Database analysts must constantly be aware of the status of their systems, and be in good, transparent standing with their various providers. When it’s time to switch to a new technology, they should be ready to provide the necessary access points and information to execute a clean transfer.
I remember when my first organization went through this process. It took two weeks for the transfer to occur, and the whole organization was “blind” during that time. Suffice it to say that the team of database analysts and systems administrators were not the most popular!
Database Analyst Requirements & Education
What education and experience do you need to be a database analyst? Well, typically a database analyst has several years of experience in a simple data analyst or business intelligence position. As is the case in most technology roles, experience is more important than education.
That being said, most database analysts today graduate with at least a bachelor’s degree in a science, technology, engineering, or math field (STEM). Some business and finance students also make the transition.
What’s most important to note is that database analysts must be analytically minded, logical, detail-oriented, and quantitative-minded. If you plan to apply for a database analyst position, know that you will need to show your familiarity with data structures, and you may even be asked to perform a technical test.
In short, there are no formal certifications or education requirements to be a database analyst, but you should:
- Know how to manipulate data in Excel at a minimum (for a junior DB analyst role)
- Be comfortable with numbers and logic
- Have at least 1 year of experience in another data-related role
This may sound intimidating, but don’t let it be. Anyone can learn the skills and build the experience needed to become a database analyst, and it’s not as hard as it sounds. Most of what scares people is the fear of not being quantitative enough, or being overwhelmed by large amounts of data.
Don’t be! Anyone can add, subtract, multiply, and divide, and the tools we use actually make data easier to understand, not harder!
Database Analyst Skills
A database analyst’s skills will depend in large part on his/her level, but we can break them down into 4 categories:
- Requirement analysis – while usually associated with business analysts, requirement analysis for database analysts is critical because they need to understand the goal of their database system before building it!
- Communication – researching new technologies and techniques, as well as how the database can better serve the organization, is a critical skill. Database analysts must be able to document these ideas and communicate them effectively to management. They might regularly read industry journals.
- Data modeling – including entity relationship diagrams (ERDs), unified modeling language diagrams (UML diagrams), and several others
- Data analysis and manipulation – database management software to work with data, SQL to query the database, advanced Excel to analyze the results, and Tableau or Power BI to visualize them
How much does a database analyst make/what’s a database analyst salary?
As with skills, a database analyst salary depends on your skill level. In addition, salaries depend on your location and company. To give you an idea of the ranges, check out this table. It gives a picture of what you can expect for a mid-level database analyst salary in different cities:
Location | Salary |
---|---|
San Francisco | $80,709 |
New York | $73,211 |
Miami | $58,590 |
Austin | $59,414 |
London | $41,962 |
Database analyst vs database administrator: what’s the difference?
While every company sees these roles differently, the rule of thumb is that database administrators are the most technical and database analysts are one degree above. Administrators may need to work directly with hardware, whereas analysts will only be involved on the software and analysis side.
Database analyst vs data analyst: what’s the difference?
Data analysts and database analysts are not the same thing. Data analysts are often members of disparate teams. They leverage their knowledge of data manipulation and data visualization in order to provide insights from specific subsets of data. Database analysts are the ones who give data analysts their subsets of data to work with!
In short, a data analyst works on pulling insights using tools and techniques on subsets of data, whereas database analysts are responsible for ensuring data analysts have access to the right data.