Get more insights from Confluence pages with CSV data export

16 Feb 2023
Confluence data can be exported to CSV via the overview component of Breeze

Breeze is a powerful Confluence app that allows you to establish content lifecycle management through periodic reviews and archiving workflows to keep Confluence free of clutter. With the new CSV data export feature, Breeze has become even more powerful, enabling you to export your data to a CSV file for further analysis.

CSV stands for “Comma-Separated Values,” a simple file format used to store tabular data. CSV files can be opened and edited using spreadsheet software like Microsoft Excel, Google Sheets, or Apple Numbers. Because the format is so widely supported, CSV files can also easily be imported into data warehouses like Google BigQuery, where data from various sources can be combined for large-scale analysis.

How to export Confluence data to CSV with Breeze

To export your Confluence data to a CSV file with Breeze, follow these simple steps:

  1. Go to the Breeze app in Confluence.
  2. Click on the overview component.
  3. Click on the export button and choose whether to export the data of the selected space or of all spaces with an assigned workflow.

☝️Please note: The duration of the export can vary depending on your selection and the number of pages in your spaces. Keep the browser tab open until Breeze has completed the export and saved the CSV file to your device. This ensures that the process finishes without interruption.

Generating insights and custom reports

The CSV file contains all pages of the selected spaces. For each page, it includes the space key, the space name, the page ID, the page title, the content relevance, the ‘last viewed’ date, the ‘last updated’ date, the page creator, the last contributor, a list of all contributors, and the page owner.

Using this data, you can easily create custom reports utilizing your favorite spreadsheet application. Let’s look at three common use cases that will bring your content management to a new level.

Providing a content quality report about all spaces

A content quality report provides a high-level view of the quality of content across all Confluence spaces by calculating an overall quality score for each space.

By identifying spaces with low quality scores, content owners and editors can prioritize their efforts and focus on updating and improving the most critical content first. Low-scoring spaces also point to areas where content creators may need additional training or support, which in turn improves the experience for readers.

Moreover, a high-level view of the overall quality of content in Confluence can provide senior management with insights, allowing them to make data-driven decisions about content management strategies and investment.

To provide an overview of all Confluence spaces, you need to consider the “Space key” and “Content relevance” columns of the CSV file. For each space key, calculate the percentage of its pages having the content relevance “Up to date.”

Spreadsheet with corresponding columns to provide a content quality overview
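If you prefer a script over a spreadsheet, the same calculation can be sketched in a few lines of Python. This is only an illustration: it assumes the header row of the export uses exactly the column names “Space key” and “Content relevance” mentioned above, and the sample rows (including the “Outdated” value) are made up for the demo.

```python
import csv
import io
from collections import defaultdict


def up_to_date_share(csv_text):
    """Return {space key: percentage of pages marked 'Up to date'}."""
    totals = defaultdict(int)   # number of pages per space
    current = defaultdict(int)  # number of up-to-date pages per space
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["Space key"]  # assumed header name
        totals[key] += 1
        if row["Content relevance"].strip() == "Up to date":
            current[key] += 1
    return {key: 100.0 * current[key] / totals[key] for key in totals}


# Made-up sample data; a real export contains more columns and rows.
sample = """Space key,Page ID,Content relevance
DOC,101,Up to date
DOC,102,Outdated
HR,201,Up to date
"""

for space, pct in sorted(up_to_date_share(sample).items()):
    print(f"{space}: {pct:.0f}% up to date")
```

To run this on a real export, read the downloaded file's contents and pass them to `up_to_date_share` instead of the sample string.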

Identifying all pages without a page owner

Pages without a page owner may not receive the same level of attention or updates as pages actively maintained by a responsible owner. By identifying these pages and assigning an owner, organizations can ensure that all content is maintained to a high standard. It further helps to ensure that the content is kept up-to-date and accurate and that someone is accountable for any issues or questions.

To identify pages without an owner, you need to consider the “Space key,” “Page ID,” and “Page owner” columns of the exported CSV file. For each space key, scan the “Page owner” column for empty cells and extract the page ID of the corresponding row.

Spreadsheet with columns to identify pages without a page owner
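The same filter can also be scripted. The following Python sketch assumes the export's header row contains the column names “Space key,” “Page ID,” and “Page owner” exactly as written here; the sample rows are invented for illustration.

```python
import csv
import io


def pages_without_owner(csv_text):
    """Return (space key, page ID) pairs for rows with an empty owner cell."""
    missing = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        owner = (row.get("Page owner") or "").strip()  # assumed header name
        if not owner:
            missing.append((row["Space key"], row["Page ID"]))
    return missing


# Made-up sample data; a real export contains more columns and rows.
sample = """Space key,Page ID,Page owner
DOC,101,Alice Example
DOC,102,
HR,201,
"""

for space_key, page_id in pages_without_owner(sample):
    print(space_key, page_id)
```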

This way, you can generate a list of all pages without owners and also provide page URLs by using this schema:

URL schema for creating links to Confluence pages

Identifying all pages with unlicensed page owners

Confluence’s licensing model requires all users who create or edit pages to have a valid license. Unlicensed page owners may be former employees or users who no longer need access to Confluence. When people leave your organization, it is a good idea to review their page ownership and decide whether their pages are still relevant. Leaving their pages unmonitored can lead to blind spots in your content management and a lack of accountability.

To identify pages with an unlicensed owner, you again need to consider the “Space key,” “Page ID,” and “Page owner” columns of the exported CSV file. Search the “Page owner” column for cells containing the substring “(unlicensed)”.

Spreadsheet with columns to identify unlicensed page owners
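As a scripted alternative, the substring check might look like the Python sketch below. It assumes the header names “Space key,” “Page ID,” and “Page owner” appear verbatim in the export and that the marker is spelled exactly “(unlicensed)”; the owner names in the sample are invented.

```python
import csv
import io

MARKER = "(unlicensed)"  # substring described in the article


def pages_with_unlicensed_owner(csv_text):
    """Return (space key, page ID) pairs whose owner cell contains MARKER."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        owner = row.get("Page owner") or ""  # assumed header name
        if MARKER in owner:
            hits.append((row["Space key"], row["Page ID"]))
    return hits


# Made-up sample data; a real export contains more columns and rows.
sample = """Space key,Page ID,Page owner
DOC,101,Alice Example
DOC,102,Bob Example (unlicensed)
HR,201,Carol Example (unlicensed)
"""

for space_key, page_id in pages_with_unlicensed_owner(sample):
    print(space_key, page_id)
```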

To generate a list, extract the space key and page ID and construct URLs with this schema:

URL schema for creating links to Confluence pages

Overall, Breeze’s CSV data export feature is the perfect tool for getting more out of your Confluence pages. It allows you to analyze your data in a more structured way and gain valuable insights that can be easily shared with others. This helps to make data-based decisions, streamline content management workflows, and boost your productivity!

Ready to clean your Confluence spaces and get rid of outdated pages?

Start your 30-day free trial now!
