Persistent copyright & licensing information in client-side JavaScript, CSS and similar – a proposal & call for help
Introduction¶
License information in source code is best stored in each file of the source code as a comment, if at all possible. That way the license metadata travels with the file even if it was copied from its original package/repository into a different one.
Client-side JavaScript, CSS and similar languages that make up a large chunk of the web are often concatenated, minified and even uglified in an attempt to make the website faster to load. In this process, comments are usually the first thing to be culled, to reduce the number of characters that serve no function to the program code itself.
Problem¶
The problem therefore is that typically when JavaScript, CSS (or similar client-side[1] code) is being built, it tends to lose not just comments that describe the code’s functionality, but also comments that carry licensing and copyright information. And since licenses (FOSS or not) typically require the text of the license and copyright notices[2] to be kept with the code, such removal can be problematic.
Proposal¶
The goal is to preserve copyright and licensing information of web front-end code even after minification in such a way that it makes FOSS license compliance[3] of web applications easier.
In addition, my proposal is intended to:
- be as simple as possible;
- be as short as possible;
- not introduce any new specifications, but rely on well-established standards; and
- not require any additional tooling, but rely on what is already in common use.
Essentially, my suggestion is literally as simple as wrapping every .js, .css and similar (e.g. .ts, .scss, …) file with SPDX snippet tags, following the REUSE specification, as follows:
At the very beginning of the file introduce an “important” comment block that starts the (SPDX) snippet and includes all the REUSE/SPDX tags that apply to this file, e.g.:
/*!
* SPDX-SnippetBegin
* SPDX-SnippetCopyrightText: © 2024 Hacke R. McRandom <hacker@mcrandom.example>
* SPDX-License-Identifier: MIT
*/
And at the very end of the file introduce another “important” comment block that simply closes the (SPDX) snippet:
/*! SPDX-SnippetEnd */
… and that is basically it!
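As a minimal sketch of the wrapping step, the following helper prepends and appends the two “important” comment blocks to a source string. The function name `wrapWithSpdxSnippet` and its parameters are hypothetical and only serve as illustration; the tag names are the snippet tags used elsewhere in this post.

```javascript
// Hypothetical helper (not part of REUSE or any build tool): wraps a source
// string in an opening and a closing /*! ... */ SPDX snippet comment.
function wrapWithSpdxSnippet(source, copyrightText, licenseExpression) {
  const header = [
    "/*!",
    " * SPDX-SnippetBegin",
    ` * SPDX-SnippetCopyrightText: ${copyrightText}`,
    ` * SPDX-License-Identifier: ${licenseExpression}`,
    " */",
  ].join("\n");
  const footer = "/*! SPDX-SnippetEnd */";
  // Trim the source so the tags sit directly around the code.
  return `${header}\n${source.trim()}\n${footer}\n`;
}

const wrapped = wrapWithSpdxSnippet(
  "code_goes_here();",
  "© 2024 Hacke R. McRandom <hacker@mcrandom.example>",
  "MIT"
);
console.log(wrapped);
```

In practice the tags would simply be typed into the file by hand; the helper only makes explicit that nothing more than two comment blocks is involved.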
How and why this works (in theory)¶
This results in e.g. a .js file that would look something like this:
/*!
* SPDX-SnippetBegin
* SPDX-SnippetCopyrightText: © 2024 Hacke R. McRandom <hacker@mcrandom.example>
* SPDX-License-Identifier: MIT OR Unlicense
*/
import half_of_npm
code_goes_here();
/*! SPDX-SnippetEnd */
and a .css file as follows:
/*!
* SPDX-SnippetBegin
* SPDX-SnippetCopyrightText: © 2020 Talha Mansoor <talha131@gmail.com>
* SPDX-License-Identifier: MIT
*/
pre {
overflow: auto;
white-space: pre;
word-break: normal;
word-wrap: normal;
color: #ebdbb2; /* This is needed due to bug in Pygments. It does not wraps some part of the code of some lagauges, like reST. This is for fallback. */
}
/*! SPDX-SnippetEnd */
All JavaScript, CSS, TypeScript, Sass, etc. files would look like that.
Then on npm run build (or whatever build system you use) the minifier keeps those tags where they are, because ! is a common trigger to keep that comment when minifying.
So if all the files are tagged as such[4], the minified barf[5] you get should include all the SPDX tags in order and in the right place, so you can see where each license/copyright notice starts and stops applying in the gibberish.
And if it pulls stuff that does not use REUSE (snippets) yet, you will still be able to tell it apart[6], since it will be the barf that’s between SPDX-SnippetEnd of the previous and SPDX-SnippetBegin of the next properly marked barf.
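To illustrate how a compliance tool could read such a bundle, here is a rough sketch that splits minified text into segments and attributes each one either to the snippet it sits in or to “unmarked”. The names `readTag` and `attributeBundle` and the sample bundle are made up for this example; the sketch assumes snippets are not nested and that the tags survive exactly as written.

```javascript
// Reads the value of one SPDX tag from a comment, or null if it is absent.
function readTag(comment, tag) {
  const m = comment.match(new RegExp(tag + ":\\s*([^\\n]*)"));
  return m ? m[1].replace(/\*\/\s*$/, "").trim() : null;
}

// Splits a bundle into code segments, each labelled with the license and
// copyright of the enclosing snippet (null when outside any snippet).
function attributeBundle(bundle) {
  const important = /\/\*![\s\S]*?\*\//g; // matches /*! ... */ comments
  const segments = [];
  let cursor = 0;
  let current = null; // tags of the currently open snippet, if any

  const pushCode = (end) => {
    const code = bundle.slice(cursor, end).trim();
    if (code) {
      segments.push({
        license: current ? current.license : null,
        copyright: current ? current.copyright : null,
        code,
      });
    }
  };

  for (const match of bundle.matchAll(important)) {
    const comment = match[0];
    const isBegin = comment.includes("SPDX-SnippetBegin");
    const isEnd = comment.includes("SPDX-SnippetEnd");
    if (!isBegin && !isEnd) continue; // unrelated important comment stays in the code
    pushCode(match.index);
    cursor = match.index + comment.length;
    current = isBegin
      ? {
          license: readTag(comment, "SPDX-License-Identifier"),
          copyright: readTag(comment, "SPDX-SnippetCopyrightText"),
        }
      : null;
  }
  pushCode(bundle.length);
  return segments;
}

// Made-up bundle: two tagged files with untagged third-party code in between.
const bundle = [
  "/*!",
  " * SPDX-SnippetBegin",
  " * SPDX-SnippetCopyrightText: 1984 Winston Smith <win@smith.example>",
  " * SPDX-License-Identifier: Unlicense",
  " */",
  "const a=1;",
  "/*! SPDX-SnippetEnd */",
  "var thirdParty=function(){};",
  "/*!",
  " * SPDX-SnippetBegin",
  " * SPDX-SnippetCopyrightText: © 2021 Test Dummy <dummy@test.example>",
  " * SPDX-License-Identifier: BSD-2-Clause",
  " */",
  "a+1;",
  "/*! SPDX-SnippetEnd */",
].join("\n");

const segments = attributeBundle(bundle);
console.log(segments);
```

The middle segment comes back with `license: null`, which is exactly the “barf between two marked snippets” case described above.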
Is this really enough?¶
OK, so now we know the start and end of a source code file that ended up in the minified barf. But are the SPDX-SnippetCopyrightText and SPDX-License-Identifier enough?
I think so, yes.
If I chose to express my copyright notice using an SPDX tag – especially if I followed the format that pretty much all copyright laws prescribe – that should be no problem[7].
The bigger question is whether communicating the license solely through the SPDX IDs[8] is enough, since you would technically not be including the whole license text(s). Honestly, I think it is fine.[9] At this stage SPDX is not just long-established in the industry and community, but is also a formal international standard (ISO/IEC 5962:2021). Practically from its beginning – and probably the best-known part of the spec – unique names/IDs of licenses and the equivalent canonical texts of those licenses have been part of SPDX. Which means that if I see SPDX-License-Identifier: MIT I know it specifically means the text that is on https://spdx.org/licenses/MIT.html. Ergo, as long as you are using licenses from the SPDX License List in these tags, all the relevant info is present.
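Since every ID on the SPDX License List resolves to a canonical text, the link from tag to license text can be generated mechanically. The following is a simplified sketch: `licenseUrls` is a hypothetical name, and it only handles plain IDs joined by AND / OR (with optional parentheses). WITH exceptions, the + suffix and LicenseRef-* identifiers are deliberately left out.

```javascript
// Turns a simple SPDX license expression into links to the canonical texts,
// following the https://spdx.org/licenses/<ID>.html pattern quoted above.
function licenseUrls(expression) {
  const operators = new Set(["AND", "OR"]);
  return expression
    .replace(/[()]/g, " ") // parentheses only group; drop them
    .split(/\s+/)
    .filter((token) => token && !operators.has(token))
    .map((id) => `https://spdx.org/licenses/${id}.html`);
}

const urls = licenseUrls("MIT OR Unlicense");
console.log(urls);
```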
How to keep the tags – Overview of popular minifiers¶
As mentioned, most minifiers tend to remove comments by default to conserve space. But there is a way to retain license comments (or other important comments). And this method has existed for over a decade now!
I have done some research into how different minifiers deal with this. Admittedly, mostly by reading through their documentation. Due to my lack of skills, I did not manage to test out all of them in practice.
But at least theoretically the vast majority of the minifiers that I was told are common (plus a few more I found) seem to support at least one way of keeping important – or even explicitly copyright/license-relevant – comments.
minifier | JSDoc-style / @license | YUI-style / /*! |
---|---|---|
Google Closure Compiler | ✔️ | ✔️ [11] |
Esbuild | ✔️ | ✔️ |
JShrink | ✔️ | ✔️ |
SWC [10] | ✔️ | ✔️ |
Terser | ✔️ | ✔️ |
UglifyJS | ✔️ | ✔️ |
YUI Compressor | ❌ | ✔️ |
Bun | ❌ | ❌ (🛠️?) |
From what I can tell it is only Bun that does not support any way to (selectively) preserve comments. There is a ticket open to implement (at least) the ! method though.
While not themselves minifiers, module bundlers do call and handle minifiers. Here are notes about the ones that I learnt are the most popular:
- WebPack is, from what I am told, the most widespread module bundler. By default it uses Terser to minify things, but it can be set to use a custom minification function too;
- Rollup is another popular module bundler. For this one you need to write a custom minification function, but in its documentation Terser is used as the example;
- (for SWC see the table above, as it does its own minification).
From this overview it seems like using the /*! comment method is our best option – it is short, the most widely supported and not loaded with meaning.
More details on both styles below.
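For Terser specifically, the comment filter can also be set explicitly instead of relying on the default. The sketch below assumes Terser 5, whose `format.comments` option accepts a regular expression that is tested against the comment text without its delimiters; the variable name `terserOptions` and the sample strings are illustrative only, and the `minify()` call is left as a comment so that the block runs without Terser installed.

```javascript
// Options object in the shape Terser's minify() accepts (assumption: Terser 5).
const terserOptions = {
  format: {
    // Keep only comments whose text starts with "!", i.e. /*! ... */ and //! ...
    comments: /^!/,
  },
};

// With Terser installed, this would be used roughly as:
//   const { minify } = require("terser");
//   const result = await minify(sourceCode, terserOptions);

// What the filter sees: the comment text with the /* and */ stripped.
const keepsSnippetEnd = terserOptions.format.comments.test("! SPDX-SnippetEnd ");
const keepsPlainComment = terserOptions.format.comments.test(" just a note ");
console.log(keepsSnippetEnd, keepsPlainComment);
```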
Using @license / JSDoc-style¶
JSDoc is a markup language used to annotate JavaScript source code files to add in-code documentation, which is then used to generate documentation.
Looking at the JSDoc specification, the following keywords seem relevant:
- @copyright – “used to document copyright information in a file overview comment” – oddly enough, this tag does not seem to be deemed preserve-worthy, so we can discount it;
- @license – “identifies the software license that applies to any portion of your code” (seems to exist since 2013);
- @preserve – this one seems to be deprecated in favour of @license, but still recognised by several minifiers, so I mention it here.
So from JSDoc it seems the best choice would be @license. To quote from the spec itself:
The @license tag identifies the software license that applies to any portion of your code.
You can use any text to identify the license you are using. If your code uses a standard open-source license, consider using the appropriate identifier from the Software Package Data Exchange (SPDX) License List.
Some JavaScript processing tools, such as Google's Closure Compiler, will automatically preserve any JSDoc comment that includes a @license tag. If you are using one of these tools, you may wish to add a standalone JSDoc comment that includes the @license tag, along with the entire text of the license, so that the license text will be included in generated JavaScript files.
Using /*! / YUI-style¶
This other style seems to originate from Yahoo! UI Compressor 2.4.8 (also from 2013), to quote its README:
C-style comments starting with /*! are preserved. This is useful with comments containing copyright/license information. As of 2.4.8, the '!' is no longer dropped by YUICompressor. For example:
/*!
* TERMS OF USE - EASING EQUATIONS
* Open source under the BSD License.
* Copyright 2001 Robert Penner All rights reserved.
*/
remains in the output, untouched by YUICompressor.
Many other projects adopted this; some extended it to also cover the single-line //!, but others have not.
Also note that YUI itself does not use the double-asterisk /** tag (if it did, it should be /**!), whereas that is typically the starting tag in JSDoc (and JavaDoc) of a document-relevant comment block.
So from the YUI-style tags, it seems that (multi-line) C-style comments that start with /*! are the most consistently supported.
And as YUI-style seems to be the most commonly implemented way to tag and preserve (licensing-relevant) comments in JS, it would seem prudent to adopt it for our purposes – to preserve REUSE-standardised tags to mark license and copyright information in files and snippets.
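The YUI-style rule itself is simple enough to sketch in a few lines. The toy function below (the name `stripUnimportantComments` is invented here) removes every C-style block comment except those starting with /*!. It is deliberately naive: a real minifier parses the code, whereas this regular expression would also be fooled by comment-like text inside strings.

```javascript
// Toy illustration of the YUI-style rule: drop /* ... */ comments,
// but keep the ones that start with /*!.
function stripUnimportantComments(code) {
  return code.replace(/\/\*[\s\S]*?\*\//g, (comment) =>
    comment.startsWith("/*!") ? comment : ""
  );
}

const input = [
  "/*! SPDX-SnippetBegin */",
  "pre { color: #ebdbb2; /* fallback colour */ }",
  "/*! SPDX-SnippetEnd */",
].join("\n");

const output = stripUnimportantComments(input);
console.log(output);
```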
A few PoC I tried¶
So far the theory …
But when it comes to testing it in practice, it gets a bit messy and I very quickly reach the limits of my JS skills.
I have tried a few PoC, and ran into mixed, yet promising, results so far.
Most of the issues, I assume, are fixable by simply changing the settings of the build tools accordingly.
It is entirely possible that a lot of the issues are PEBKAC as well.
Svelte + Rollup + Terser¶
The simplest PoC I tried is a Svelte app that uses Rollup as a build tool and Terser as the minifier – kindly offered by Oliver “oliwerix” Wagner as a guinea pig. I left the settings as they are, and the results are mostly fine.
First we pull the 29c9881 commit and build it with npm install; npm run build.
In public/build/ we have three files: bundle.css, bundle.js, bundle.js.map.
The bundle.css file does not have any SPDX-* tags, and I suspect this is because it consists solely of 3rd party components, which do not use these tags yet. The public/global.css is still referred to separately in public/index.html and retains the SPDX-* tags in its non-minified form. So that is fine, but would need further testing to check the minified CSS.
The bundle.js file contains the minified JS and the SPDX-* tags remain there, but with one SPDX-SnippetEnd being misplaced.
If we compare e.g. rg SPDX- src/*.js [12]:
src/store.js
2: * SPDX-SnippetBegin
3: * SPDX-SnippetCopyrightText: 1984 Winston Smith <win@smith.example>
4: * SPDX-License-Identifier: Unlicense
18:/*! SPDX-SnippetEnd */
src/main.js
2: * SPDX-SnippetBegin
3: * SPDX-SnippetCopyrightText: © 2021 Test Dummy <dummy@test.example>
4: * SPDX-License-Identifier: BSD-2-Clause
17:/*! SPDX-SnippetEnd */
… and rg SPDX- public/build/bundle.js
3: * SPDX-SnippetBegin
4: * SPDX-SnippetCopyrightText: 1984 Winston Smith <win@smith.example>
5: * SPDX-License-Identifier: Unlicense
8:/*! SPDX-SnippetEnd */
10:/*! SPDX-SnippetEnd */
13: * SPDX-SnippetBegin
14: * SPDX-SnippetCopyrightText: © 2021 Test Dummy <dummy@test.example>
15: * SPDX-License-Identifier: BSD-2-Clause
… it is clear that something is amiss. A snippet cannot end before it begins.
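A check like this is easy to automate. The sketch below walks through a file line by line and reports every SPDX-SnippetEnd that has no open snippet before it, as well as snippets that are never closed. The function name `checkSnippetOrder` is hypothetical, and the sample input is a reconstruction with placeholder code lines that only mirrors the line numbers of the begin/end tags shown above. A depth counter is used so that nested snippets, should they occur, would not be flagged.

```javascript
// Reports SPDX snippet tags that are out of order in the given text.
function checkSnippetOrder(text) {
  const problems = [];
  let depth = 0; // number of snippets currently open
  text.split("\n").forEach((line, index) => {
    if (line.includes("SPDX-SnippetBegin")) {
      depth += 1;
    } else if (line.includes("SPDX-SnippetEnd")) {
      if (depth === 0) {
        problems.push(`line ${index + 1}: SPDX-SnippetEnd without an open snippet`);
      } else {
        depth -= 1;
      }
    }
  });
  if (depth > 0) {
    problems.push(`end of input: ${depth} snippet(s) never closed`);
  }
  return problems;
}

// Placeholder bundle: only the tag positions match the rg output above.
const lines = new Array(15).fill("minified_code();");
lines[2] = " * SPDX-SnippetBegin";    // line 3
lines[7] = "/*! SPDX-SnippetEnd */";  // line 8
lines[9] = "/*! SPDX-SnippetEnd */";  // line 10
lines[12] = " * SPDX-SnippetBegin";   // line 13

const problems = checkSnippetOrder(lines.join("\n"));
console.log(problems);
```

On this input the second SPDX-SnippetEnd and the unclosed second snippet are both reported, which matches what can be seen by eye in the listing.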
But when checking the public/build/bundle.js.map SourceMap, we again see the SPDX-* tags in order just fine.
I would really like to know what went wrong here.
React Scripts (+ WebPack + Terser)¶
Before that I tried to set up a “simple” React Scripts app with the help of my work colleague, Carlos “roclas” Hernandez.
Here, again, I am getting mixed results out of the box.
First we pull the af54954 commit and build it with npm install; npm run build.
On the CSS side, we see that there are two files in build/static/css/, namely: main.05c219f8.css and main.05c219f8.css.map.
Both the main.05c219f8.css and its SourceMap retain the SPDX-* tags where we want them, so that is great!
On the JS side it gets more complicated though. In build/static/js/ we have several files now, and if we pair them up:
- 453.2a77899f.chunk.js and 453.2a77899f.chunk.js.map
- main.608edf8e.js, main.608edf8e.js.LICENSE.txt and main.608edf8e.js.map
The 453.2a77899f.chunk.js* files contain no SPDX-* tags. Honestly, I do not know where they came from, but assume it is again 3rd party components, which do not use these tags yet. So we can ignore them.
But it is the main.608edf8e.js* files that we are interested in.
Unfortunately, it is here that it gets a bit annoying.
It seems React Scripts is quite opinionated and hard-codes its preferences when it comes to minification etc. So even though it is easy to set up WebPack and Terser to preserve (important) comments in the code itself, React Scripts overrides that with its own settings.
What this results in then is the following:
- main.608edf8e.js is cleaned of all comments – no SPDX-* tags here;
- but now main.608edf8e.js.LICENSE.txt has all the SPDX-* tags as well as other important comments (e.g. @license blocks from 3rd party components);
- and as for the main.608edf8e.js.map SourceMap, it includes SPDX-* tags, as expected.
The really annoying bit is that it seems like main.608edf8e.js.LICENSE.txt is not mapped, it is just a dump of all the license-related comments. So that does not help us here.
There is a workaround by injecting code and settings using Rewire, but so far I have not managed to set it up correctly. I am sure it is possible, but I gave up after it took way too much of my time already.
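For comparison, on a plain WebPack setup (not React Scripts, and not tested as part of these PoCs) the behaviour is controlled by the options passed to terser-webpack-plugin. The sketch below assumes WebPack 5 with that plugin: its `extractComments` option is what produces the separate *.LICENSE.txt file, and `terserOptions` is handed on to Terser. The variable name `terserPluginOptions` is illustrative, and the plugin wiring is shown only as a comment so that the block runs on its own.

```javascript
// Options in the shape terser-webpack-plugin accepts (assumption: WebPack 5
// with terser-webpack-plugin; whether React Scripts can be made to use
// these is exactly the open question above).
const terserPluginOptions = {
  // Do not move preserved comments into a separate *.LICENSE.txt file.
  extractComments: false,
  terserOptions: {
    format: {
      // Keep comments that start with "!" inside the minified output itself.
      comments: /^!/,
    },
  },
};

// In webpack.config.js this would be used roughly as:
//   const TerserPlugin = require("terser-webpack-plugin");
//   module.exports = {
//     optimization: { minimizer: [new TerserPlugin(terserPluginOptions)] },
//   };

console.log(terserPluginOptions);
```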
Some early thoughts on *.js.map and *.js.LICENSE.txt¶
If the license and copyright info is missing from the minified source code, but it is there in the *.js.map SourceMap (spec), I think that is better than nothing, but I am leaning towards it not being enough for the goal we are trying to reach here.
Similarly, when the minifier simply shoves all the license comments into a separate *.js.LICENSE.txt file, removing them from the *.js file and without any way to map the license and copyright comments back to the source code, I do not see how this is much more useful than the *.js.map itself.
So far, it seems to me like this is a problem caused by some frameworks (e.g. React Scripts) hard-coding their preferences when it comes to minification, without an easy way to override it.
But if there was a *.js.LICENSE.txt (or equivalent) that was mapped[13] via SourceMaps, so one could figure out which license comment matches which source code block in the minified code, I would be inclined to take that as potentially good enough.
Future ideas¶
Once the base issue of preserving SPDX tags in minified (web front-end) code in proper places is solved, we can expand it to make it even more useful.
Here is a short list of ideas that popped up already. I am keeping them apart from the proposal itself, to not detract too much from the base problem.
Nothing stops us from adding more relevant information in these tags – in fact, as long as it is an SPDX tag, that would be in line with both the SPDX standard and the REUSE spec. A perfect candidate to include would be something to designate the origin or provenance of the package the file came from – e.g. using PackageURL.
To make this even more useful in practice, it is entirely imaginable that build tools could help generate or populate these tags and therefore inject information themselves, some early ideas:
- for (external) packages that do not use REUSE/SPDX Snippet Tags, the build tool could be ordered to generate them from REUSE/SPDX File Tags;
- same, but to pull the information from a different source (e.g. LICENSE file) – that might be a bit dubious when it comes to exactness though;
- the above-mentioned PackageURL (or other origin identifier) could be added by the build tool.
All of the above future ideas are super early ideas and some could well be too error-prone to be useful, but should be kept in mind and discussed after the base issue is solved.
Open questions & Call for help and testers¶
As stated, I am not very savvy when it comes to writing web front-ends, so at this point this project would really benefit from people with experience in building web front-ends taking a look.
If anyone knows how to get React and any other similarly opinionated frameworks to not create the *.js.LICENSE.txt file, but keep the important comments in code, that would be really useful.
If you are a legal or license compliance expert, while I am quite confident in the logic behind this, feedback is very welcome and do try to poke holes in my logic.
If you are a technical person (esp. front-end developers), please, try applying this to code in different build environments and let me know what works and what breaks. We need more and better PoCs.
If you have proposed fixes, even better!
Comments, ideas, suggestions, … welcome.
Ultimately, my goal is to come up with a solution that works and requires (ideally) no changes in tooling and specifications.
If that means abandoning this proposal and finding a better one, so be it. But it has to start somewhere that is half-way doable.
Credits¶
While I did spend quite a bit of time on this, this would not exist without prior work and a lot of help from others.
First of all, the basic idea of treating each file as a snippet that just happens to currently span a whole file, was originally proposed to me by José Maria “chema” Balsas Falaguera in 2020 and we worked together on an early PoC on how to apply REUSE to a large-ish JS + CSS code-base … before (REUSE Snippets and later) SPDX Snippets came to be.
In fact, it was this idea of Chema’s that sparked my quest to first bring snippet support to REUSE, and later to SPDX specifications.
At REUSE I would specifically like to thank Carmen Bianca Bakker, Max Mehl, and Nico Rikken for not refusing this idea upfront and also for being great intellectual sparring partners on this adventure.
And at SPDX it was Alexios “zvr” Zavras who saw the potential and helped me draft SPDX tags for snippets to the point where it got accepted.
I would also like to thank Henrik “hesa” Sandklef, Maximilian Huber and Philippe Ombredanne for their feedback and some proposals on how to expand this further later on.
Of course, none of this would be possible without everyone behind SPDX, REUSE, YUI, and JSDoc.
I am sure I forgot to mention a few other people, and if that was you, I humbly apologise and ask you to let me know, so I can correct this issue.
hook out → wow, this took longer than expected … so many rabbit holes!
[1] Also called browser-side code.
[2] In droit d’auteur / civil law jurisdictions, the removal of a copyright statement could also qualify as a violation of civil law or even a criminal offence … when the copyright holder is an author (physical person), and not a legal entity.
[3] Both from the point-of-view of a license compliance expert and their tooling; and from (at least) the spirit of FOSS licenses.
[4] Barring any magical weirdness that happens during the minifying/uglifying that might mess things up in things I did not predict. Here I need front-end experts’ help and more testing.
[5] If unreadable binary is a blob, “barf” sounded like an appropriate equivalent for nigh-unreadable minified code.
[6] Here is where the added functionality of treating every file as a (very-long) snippet pays off. If we did not mark the start and end of each file, we would not be able to tell that in the minified code. In that case, a consequence would be that we would falsely assume a piece of minified code would fall under a certain license, whereas it would simply be part of a file that was unmarked (e.g. pulled in from a 3rd party component).
[7] Probably Apache-2.0’s requirement to include the NOTICE file would not be satisfied, but that is not a new issue. And there are three ways to make Apache-2.0 happy, so it is solvable if you are diligent.
[8] The value of an SPDX-License-Identifier tag is an SPDX License Expression, which can either be just a single license or a logical statement of several licenses. (Hat-tip to Peter Monks for pointing this out.)
[9] I do accept that there is a tiny bit of handwaving involved here, but honestly, I think this should be good enough, given the limitations of minified code. If the consensus ends up that full license texts need to be shipped as well, this can be done via hyperlinks, and those can be generated from the SPDX IDs. Or if you really want to be pedantic about it, upload and hyperlink the (usually missing) license texts from each JS/CSS package itself.
[10] SWC is primarily a compiler and module bundler, but I keep it here because it does its own minification instead of relying on an external tool.
[11] It is not documented but Google Closure Compiler does indeed support ! and treats it explicitly as a license block (without any additional JSDoc parsing) since 2016.
[12] I use RipGrep in this example, but normal grep should work too.
[13] And I do not know of any that does that.