What is a YAML file?

Introduction

YAML is a recursive acronym for YAML Ain't Markup Language. It is a human-readable data serialization format that serves as an alternative to JSON and XML for most common programming tasks, including network transmission and cross-language data transfer. It is frequently used to store configuration data or other information that needs to be edited by a person and processed by a program. In this article, we will look at the structure and syntax of a YAML file.

YAML file example

In YAML, a document is a set of objects intended to be parsed together and stored into a single output.

Below is an example of a GitHub Actions workflow, stored as a YAML document in a single file. This file describes a single workflow job that runs whenever content is pushed to the master branch of a repository. As you can see, the file structure uses nested nodes to convey information (a node is any single item in the document, such as a value, a list entry, or a whole nested block). Nesting is determined by the amount of leading whitespace before each node: nodes with the same indentation under the same parent are at the same level. Indentation must use spaces; tab characters are not allowed for indentation.

name: AWS-CD

on:
  push:
    branches:
      - master
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v1
      - name: Install Node packages
        run: npm install
      - name: Build the project
        run: npm run build
      - name: Deploy to S3
        run: AWS_ACCESS_KEY_ID=${{ secrets.AWS_ACCESS_KEY_ID }} AWS_SECRET_ACCESS_KEY=${{ secrets.AWS_SECRET_ACCESS_KEY }} aws s3 sync dist s3://${{ secrets.S3_BUCKET_NAME }}/

This example job contains several steps:

  • Check out the current branch
  • Install all Node packages
  • Build the project
  • Deploy it to an AWS S3 bucket

You can read more about how this is used in our article on continuous deployment to AWS S3.

In this example, YAML representation is useful because it allows the data defining this job to be represented clearly and with minimal markup characters. This makes it easy for developers and programs to understand it.
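
To make the "minimal markup" point concrete, here is a small sketch that writes the trigger data from the workflow above twice: once in the indented block style used throughout this article, and once in JSON-like flow style, which YAML 1.2 parsers also accept. The key names blockStyle and flowStyle are made up for this illustration and are not part of GitHub Actions.

```yaml
# Block style: structure comes from indentation alone.
blockStyle:
  push:
    branches:
      - master

# Flow style: the same data, but with explicit braces, brackets, and quotes.
flowStyle: {"push": {"branches": ["master"]}}
```

Both keys should parse to the same nested value; the block form simply needs fewer punctuation characters to say it.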

YAML Mappings

The most frequently used YAML collection is the mapping. It associates a unique key with a particular value. This is comparable to maps in Java and dictionaries in Python. Here is a quick example of a map of state capitals using YAML, where each state is a key and each capital city is a value:

New York: Albany
Texas: Austin
California: Sacramento

YAML Sequences

Sequences are ordered collections of values. They are similar to lists or arrays in most other programming languages. One notable difference from languages like C# and Java, however, is that YAML collections are not strongly typed. A single YAML sequence can contain strings, numbers, and booleans all at once, much like a list in Python or an array in JavaScript.

- One
- 2
- 3.45

Sequences and mappings can be placed inside of each other. Nesting them in this way is done using indentation instead of brackets, similar to Python’s indentation scopes.

sequenceInsideMap:
  - anItem
  - mapInsideSequenceItem1: a
    mapInsideSequenceItem2: b
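
Nesting works in the other combinations too, not only a sequence inside a mapping. The following is a small sketch with made-up key names, showing a mapping inside a mapping and a sequence inside a sequence:

```yaml
mapInsideMap:
  outerKey:
    innerKey: value

sequenceInsideSequence:
  # Each "- -" starts a new inner list as an item of the outer list.
  - - first inner item
    - second inner item
  - - only item of the second inner list
```

In every case, an item belongs to the nearest enclosing node that is indented less than it is.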

YAML Scalars

Scalars are single values, as opposed to collections. YAML supports strings, integers, floating-point numbers, and booleans, and many parsers also recognize dates and datetime timestamps:

String: Hello world!
Integer: 42
Float: 12.345
Boolean: true
Date: 2020-12-25
Datetime: 2020-12-25T08:00:00.0Z

These types are generally inferred by the parser, but you can explicitly declare the type of a scalar using a !!type tag, as seen below. This ensures that programs parsing arbitrary YAML interpret the content as intended, especially in cases where the parser's guess would differ from what you meant. For example, without the !!str tag below, a parser would interpret this value as a boolean rather than a string:

word: !!str true
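
Quoting is another common way to get the same effect, since a scalar wrapped in single or double quotes is read as a string. A brief sketch (the key names are illustrative only):

```yaml
tagged: !!str true    # the string "true", forced by the explicit tag
quoted: "true"        # the string "true", because quoted scalars are strings
unquoted: true        # the boolean true
version: "1.10"       # a string; unquoted, this would be read as the float 1.1
```

The last line shows why this matters in practice: values such as version numbers can silently lose information if the parser is left to guess.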

You can also use null as a scalar value, either by writing null (or the shorthand ~) or by leaving the value empty in a mapping or sequence.
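
As a sketch, each value below should be read as null by a typical parser (the key names are made up for illustration):

```yaml
explicitNull: null
tildeNull: ~
emptyValue:
listWithNull:
  - first
  -           # an entry with nothing after the dash is null
  - third
```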

YAML Strings

Strings have a few properties not shared by other scalars. First, literal block scalars, introduced with the vertical bar character |, preserve the newlines between lines. For example, the following YAML file is equivalent to {"longString": "one\ntwo\nthree\n"} in JSON (by default, a single trailing newline is kept):

longString: |
  one
  two
  three

You can also use folded scalars, introduced with >. The difference between the two is that while literal scalars preserve the newlines between lines, folded scalars turn each single newline into a space. For example, the following YAML file is equivalent to {"longString": "one two three\n"} in JSON (again, a single trailing newline is kept by default):

longString: >
  one
  two
  three
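
Both block styles keep a single trailing newline by default. If that is not wanted, a chomping indicator can be added right after the | or > character: a minus sign strips the final newline, and a plus sign keeps all trailing newlines. A short sketch with illustrative key names:

```yaml
# Default ("clip"): parses to "one\ntwo\n"
clipped: |
  one
  two

# Strip: parses to "one\ntwo"
stripped: |-
  one
  two

# Folded and stripped: parses to "one two"
foldedStripped: >-
  one
  two
```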

YAML Comments

Comments are denoted with a pound sign #. They are not processed by YAML parsers and should be used to provide information to future file maintainers.

str: Test  # This is a comment; its content will not be included in the string
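
One detail worth knowing is that # only starts a comment at the beginning of a line or after whitespace, and never inside a quoted string. The sketch below uses placeholder values to show the difference:

```yaml
# A comment on its own line.
site: http://example.com/#section   # the first # is part of the value; the second starts this comment
hashtag: "#yaml"                    # quoted, so the # is part of the string
```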

YAML Anchors

Anchors are used to create a reference to a node, that is, a single object stored in the current YAML file, such as a string or a mapping. This reference, called an alias, can then be used later to repeat the original node's content, allowing YAML file maintainers to write a node once and reuse it anywhere.

techCompanies:
  - Apple
  - &AMZN Amazon  # "&AMZN" is a node anchor attached to the "Amazon" string
  - Microsoft
retailCompanies:
  - Walmart
  - *AMZN  # This will be replaced with "Amazon" in the parser output
  - Newegg
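
Anchors are not limited to strings. As a sketch, the anchor below is attached to a whole mapping, and each alias repeats that mapping in full (all key names here are hypothetical):

```yaml
defaultSettings: &defaults
  retries: 3
  timeout: 30

serviceA: *defaults   # parses to the same mapping as defaultSettings
serviceB: *defaults
```

Note that a plain alias repeats the node as a whole; it does not provide a way to change individual keys of the copied mapping.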

YAML Document Start/End

As we mentioned above, a document in YAML is a set of objects intended to be parsed together and stored into a single output. By default, each file only contains one document, but YAML includes a mechanism for defining multiple documents in one file. A series of three dashes --- can be used to denote the start of a document, while a series of three dots ... can be used to specify its end. This is useful for long-lived communication channels or for files that should contain several distinct entries.

---
name: Halloween
date: 2020-10-31
...
---
name: New Year’s Day
date: 2021-01-01
...
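
In a file, the ... end marker is optional: a new --- line is enough to close the previous document and start the next one. The same two documents could therefore be sketched more compactly as:

```yaml
---
name: Halloween
date: 2020-10-31
---
name: New Year's Day
date: 2021-01-01
```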

Conclusion

In this article, we discussed the structure and syntax of a YAML file.

If you're interested in learning more about coding basics, we wrote the Coding Essentials Guidebook for Developers, which covers core coding concepts and tools. It contains chapters on computer architecture, the Internet, Command-Line, HTML, CSS, JavaScript, Python, Java, SQL, Git, and more.

We hope you enjoyed this post! Feel free to shoot me an email at jacob@initialcommit.io with any questions or comments.
