Bazel Rules: Multiple Source Files

2021-12-29

Intro

Things get slightly more complicated when we need to use multiple source files. Let’s imagine that we need to concatenate several shell scripts. Our BUILD file could use the srcs attribute to follow Bazel conventions:

load(":rules.bzl", "demo_binary")

demo_binary(
    name = "multiple_source_files",
    srcs = [
        "english.sh",
        "french.sh",
    ],
    out = "hello",
)

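For the sake of the example, each file echoes a greeting in the corresponding language. Here is english.sh:

echo 'Hello, World!'

And french.sh, whose content follows from the expected result below:

echo 'Bonjour monde!'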

Given these inputs, the script we want to build should look like this:

#!/bin/sh
echo 'Hello, World!'
echo 'Bonjour monde!'

Rule Changes

First, we need to modify the rule definition to introduce the new attribute using the attr.label_list function. Notice that we’re using allow_files instead of allow_single_file, here restricted to files with the .sh extension.

demo_binary = rule(
    implementation = _demo_binary_impl,
    executable = True,
    attrs = {
        "out": attr.output(mandatory = True),
        "srcs": attr.label_list(
            mandatory = True,
            allow_files = [".sh"],
        ),
    },
)

Implementation Function

A simple way to make this work involves the run_shell Bazel action. It lets us execute ordinary shell commands, including pipes and redirections. Generally, run_shell is a very convenient instrument. At the same time, it can introduce issues with hermeticity, portability, injection, and escaping, so it is best used in a fully controlled and trusted environment, or when we are willing to accept a quick and dirty solution.
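To make the escaping concern concrete, here is a minimal sketch of what happens when paths are interpolated into a command string with and without quoting. It is written in the subset of Python that is also valid Starlark, so it runs as plain Python. The helper name shell_quote and the sample paths are hypothetical and chosen only to show the problem cases; bazel_skylib ships a comparable shell.quote function.

```python
# Hypothetical helper: wrap a string in single quotes for POSIX sh,
# escaping each embedded single quote as '\'' (close, escaped quote, reopen).
def shell_quote(s):
    return "'" + s.replace("'", "'\\''") + "'"

# Hypothetical paths: one plain, one with a space, one with a single quote.
paths = ["pkg/english.sh", "pkg/my french.sh", "pkg/it's.sh"]

# Naive interpolation: the shell would split "pkg/my french.sh" into two
# words and treat the quote in "it's.sh" as the start of a string.
unquoted = "cat " + " ".join(paths)

# Quoted interpolation: every path reaches cat as exactly one argument.
quoted = "cat " + " ".join([shell_quote(p) for p in paths])

print(unquoted)
print(quoted)
```

With well-behaved file names the two commands act the same, which is why the unquoted form is tolerable in a controlled environment.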

We will store the command template in a private variable and then build the cat arguments with a Python-style idiom that Starlark shares: a join over a list comprehension. We have to pass the inputs to run_shell explicitly. Otherwise, the files will not be available during the execution stage. The rest of the function should be familiar.

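# Shell command template: write the shebang line, then append all sources.
_SCRIPT = "echo '#!/bin/sh' > {out} && cat {srcs} >> {out}"

def _demo_binary_impl(ctx):
    out = ctx.outputs.out
    cmd = _SCRIPT.format(
        # Space-separated source paths, in the order listed in srcs.
        srcs = " ".join([p.path for p in ctx.files.srcs]),
        out = out.path,
    )
    ctx.actions.run_shell(
        # Inputs must be declared, or the files will be missing at execution time.
        inputs = ctx.files.srcs,
        outputs = [out],
        command = cmd,
    )
    return [DefaultInfo(
        files = depset([out]),
        executable = out,
    )]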
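To see what the action actually runs, the template expansion can be reproduced outside Bazel. The sketch below uses only expressions that are valid in both Python and Starlark. The two source paths and the output path are hypothetical stand-ins; in a real build they come from File.path and depend on the platform and configuration.

```python
_SCRIPT = "echo '#!/bin/sh' > {out} && cat {srcs} >> {out}"

# Hypothetical paths standing in for the values of File.path.
src_paths = [
    "multiple_source_files/english.sh",
    "multiple_source_files/french.sh",
]
out_path = "bazel-out/k8-fastbuild/bin/multiple_source_files/hello"

# join walks the list front to back, so the sources keep their listed order.
cmd = _SCRIPT.format(
    srcs = " ".join([p for p in src_paths]),
    out = out_path,
)
print(cmd)
```

The printed command first writes the shebang line to the output file and then appends english.sh followed by french.sh.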

As usual, we can build the target and inspect the generated script:

$ bazel build //multiple_source_files
...
Target //multiple_source_files:multiple_source_files up-to-date:
  bazel-bin/multiple_source_files/hello
...
$ cat bazel-bin/multiple_source_files/hello
#!/bin/sh
echo 'Hello, World!'
echo 'Bonjour monde!'

Notice that the generated script preserves the order in which the files are listed in srcs, which keeps the build deterministic. Finally, the complete example is in the repo.
