Filtering API Data with Plain JavaScript
GraphQL is Cool, but…
Published: 2022-09-17
2022-08-19 | Idea |
2022-08-26 | Draft |
2022-09-17 | Published |
Background
For my non-programming friends, some background is in order as to why this is a problem in the first place.
In the beginning, there was the web. It wasn’t quite formless and void, but it was very basic. Just HTML. At some point JavaScript was invented, and then CSS. People started to use the web for more and more things, and eventually applications – not just pages – started to be written. Sometimes the applications needed to talk to each other. And thus APIs – Application Programming Interfaces – were born.
It took a while for APIs to look like they do today, and there are still a few ways to make them, but the general principle is that there’s a URL that serves data, rather than a web page. For example, if you wanted to see the metadata for my homepage, you could go to
http://sveltekit-prerender/__data.json
This particular data is in a format called JSON, one of the most common formats for such data today. If it makes you say 😵‍💫, I am sorry. But computers find it quite easy to read!
So we have a URL serving data, rather than a web page, but… what is this about filtering? Well, dear reader, say we have several applications, and they all want slightly different pieces of this data. What should we do then?
One solution would be to say no, we are going to send everyone all the data, and each application can ignore what it doesn’t want. This is simple, but a giant waste of data.
Another solution would be to say okay, we’ll make a special URL for each application, and then everyone can have exactly what they want. This turns out to be not only a lot of work, but very easy to break. What happens when an application changes, and wants slightly different data? Or an application wants different pieces of data at different times? Or an application stops using one URL, and lets you know so you can delete it, but secretly another application started using the old URL, and now deleting it breaks everything? No fun at all.
A third solution, and the one I like best, is to set up a single URL (or a small collection of them), and have each application ask for what it wants when it wants it. Good communication: a bit more work, but it improves the quality of relationships, even for programs 😉.
Of course, just like there are many different ways for people to communicate what they need, there are many different ways for programs to do it too. It would be helpful to agree on a common language at least. And fiiiiinally, this is where GraphQL comes in: it’s a common language that applications can use to say what they want, and servers can use to respond.
So what am I talking about here? Making a new language, just to confuse everybody?
Sort of, but not quite. I think it’s fun to play with languages. So this is an experiment along those lines :-)
We’ll get more technical below.
Introduction
I was recently implementing a small API, and found myself wanting to filter out some of the data before sending the response. All of the data was computed during build time, and it really wasn’t that much… so using GraphQL would have been a bit silly. But doing it manually… nah, might as well let it send a few duplicated fields.
GraphQL felt like a cool idea though, and I wondered how close I could get with just a little bit of JavaScript. I wrote a simple filter function
/**
 * Fill in null entries of `query` with data from `data`
 */
export function filter(data, query) {
  for (const key in query)
    if (query[key] === null) query[key] = data[key];
    else filter(data[key], query[key]);
  return query;
}
and tried it out
import { log_json } from "../util.js";
import { filter } from "./filter.js";
console.log("// one");
log_json(
filter(
{
one: 1,
two: 2,
three: 3,
},
{
two: null,
}
)
);
console.log("\n// two");
log_json(
filter(
{
one: { one: 11, two: 12 },
two: { one: 21, two: 22 },
three: { one: 31, two: 32 },
},
{
one: { one: null },
two: null,
}
)
);
// one
{ "two": 2 }

// two
{
  "one": { "one": 11 },
  "two": { "one": 21, "two": 22 }
}
and wow. That actually does everything I need it to do.
All done!
Just kidding. This got me thinking: how much of GraphQL could we implement before it started to get complicated? Let’s try. We’ll stop when we get tired :-)
Setup
To keep tests shorter, we’ll set up some test data here.
/**
* Modified from the Apollo GraphQL Star Wars server example data
* - retrieved 2022-08-23
* - from https://github.com/apollographql/starwars-server/blob/main/data/swapiSchema.js
* - under the MIT license https://github.com/apollographql/starwars-server/blob/main/LICENSE
*/
class Human {
id;
name;
appearsIn;
homePlanet;
height;
mass;
constructor(data) {
Object.defineProperties(this, {
friends: {
set: (value) => {
this._friends = value;
},
get: () => this._friends.map((e) => idMap[e]),
enumerable: true,
},
starships: {
set: (value) => {
this._starships = value;
},
get: () => this._starships.map((e) => idMap[e]),
enumerable: true,
},
});
for (const key in this) this[key] = data[key];
}
}
class Droid {
id;
name;
appearsIn;
primaryFunction;
constructor(data) {
Object.defineProperties(this, {
friends: {
set: (value) => {
this._friends = value;
},
get: () => this._friends.map((e) => idMap[e]),
enumerable: true,
},
});
for (const key in this) this[key] = data[key];
}
}
class Starship {
id;
name;
length;
constructor(data) {
for (const key in this) this[key] = data[key];
}
}
// ----------------------------------------------------------------------------
export const humans = [
new Human({
id: "1000",
name: "Luke Skywalker",
friends: ["1002", "1003", "2000", "2001"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
homePlanet: "Tatooine",
height: 1.72,
mass: 77,
starships: ["3001", "3003"],
}),
new Human({
id: "1001",
name: "Darth Vader",
friends: ["1004"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
homePlanet: "Tatooine",
height: 2.02,
mass: 136,
starships: ["3002"],
}),
new Human({
id: "1002",
name: "Han Solo",
friends: ["1000", "1003", "2001"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
height: 1.8,
mass: 80,
starships: ["3000", "3003"],
}),
new Human({
id: "1003",
name: "Leia Organa",
friends: ["1000", "1002", "2000", "2001"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
homePlanet: "Alderaan",
height: 1.5,
mass: 49,
starships: [],
}),
new Human({
id: "1004",
name: "Wilhuff Tarkin",
friends: ["1001"],
appearsIn: ["NEWHOPE"],
height: 1.8,
mass: null,
starships: [],
}),
];
export const droids = [
new Droid({
id: "2000",
name: "C-3PO",
friends: ["1000", "1002", "1003", "2001"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
primaryFunction: "Protocol",
}),
new Droid({
id: "2001",
name: "R2-D2",
friends: ["1000", "1002", "1003"],
appearsIn: ["NEWHOPE", "EMPIRE", "JEDI"],
primaryFunction: "Astromech",
}),
];
export const starships = [
new Starship({
id: "3000",
name: "Millennium Falcon",
length: 34.37,
}),
new Starship({
id: "3001",
name: "X-Wing",
length: 12.5,
}),
new Starship({
id: "3002",
name: "TIE Advanced x1",
length: 9.2,
}),
new Starship({
id: "3003",
name: "Imperial shuttle",
length: 20,
}),
];
// ----------------------------------------------------------------------------
const humanData = {};
humans.forEach((human) => {
humanData[human.id] = human;
});
const droidData = {};
droids.forEach((droid) => {
droidData[droid.id] = droid;
});
const starshipData = {};
starships.forEach((ship) => {
starshipData[ship.id] = ship;
});
// ----------------------------------------------------------------------------
const idMap = { ...humanData, ...droidData, ...starshipData };
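The classes above store `friends` and `starships` as lists of IDs and only resolve them through `idMap` when the property is read. Here is a stripped-down sketch of that pattern. The `Node` class and `registry` object are stand-ins invented for this example, and unlike the classes above it hides the raw ID list in a non-enumerable property.

```javascript
// Hypothetical registry playing the role of `idMap`
const registry = {};

class Node {
  constructor(id, friendIds) {
    this.id = id;
    // keep the raw IDs in a non-enumerable property
    // (an assumption of this sketch; the article's `_friends` is a regular one)
    Object.defineProperty(this, "_friends", {
      value: friendIds,
      writable: true,
    });
    // resolve IDs lazily, so objects can refer to each other
    // before the registry is fully populated
    Object.defineProperty(this, "friends", {
      get: () => this._friends.map((friendId) => registry[friendId]),
      enumerable: true,
    });
    registry[id] = this;
  }
}

const a = new Node("a", ["b"]);
const b = new Node("b", ["a"]); // circular reference is fine

console.log(a.friends[0].id); // "b"
console.log(a.friends[0].friends[0] === a); // true
```

Because the references are circular, calling `JSON.stringify` on one of these objects directly throws a `TypeError`, which is one more reason to pick out fields before serializing.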
Basic Filtering
The function above does everything I need it to, but it doesn’t begin to approach what a GraphQL query can do. To start with, it can’t filter fields of objects in arrays. Let’s add that
/**
 * Return an object with the same shape as `query`, with data from `data`
 *
 * - `data` can be an array or a non-array object
 * - `query` should be a non-array object or null
 */
export function filter(data, query) {
  // fill in null entries
  if (query === null) return data;

  // filter arrays of objects
  if (Array.isArray(data)) return data.map((obj) => filter(obj, query));

  // filter recursively
  // - put results in `out` instead of modifying the original query, otherwise
  //   we can't use `query` multiple times, like we need to when filtering
  //   arrays of objects above
  let out = {};
  for (const key in query) out[key] = filter(data[key], query[key]);
  return out;
}
and give it a try.
import { log_json } from "../util.js";
import { droids } from "../data.js";
import { filter } from "./filter.js";
log_json(
filter(droids, {
name: null,
friends: {
name: null,
homePlanet: null,
},
})
);
[
  {
    "name": "C-3PO",
    "friends": [
      { "name": "Luke Skywalker", "homePlanet": "Tatooine" },
      { "name": "Han Solo" },
      { "name": "Leia Organa", "homePlanet": "Alderaan" },
      { "name": "R2-D2" }
    ]
  },
  {
    "name": "R2-D2",
    "friends": [
      { "name": "Luke Skywalker", "homePlanet": "Tatooine" },
      { "name": "Han Solo" },
      { "name": "Leia Organa", "homePlanet": "Alderaan" }
    ]
  }
]
This took more effort, since there were more cases to consider, but it seems to work. We have our base case (`query === null`), we handle the case that `data` is an array, and for all other cases we assume that `query` and `data` are both objects, go through the values in `query` one by one, and filter recursively.
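To see why the results go into `out`, it helps to put the two versions side by side. The snippet below is a small self-contained check with made-up data; `filterInPlace` and `filterCopy` are names chosen here only to tell the two versions apart.

```javascript
// First version: writes results into `query` (mutates it)
function filterInPlace(data, query) {
  for (const key in query)
    if (query[key] === null) query[key] = data[key];
    else filterInPlace(data[key], query[key]);
  return query;
}

// Second version: builds a fresh `out` object
function filterCopy(data, query) {
  if (query === null) return data;
  if (Array.isArray(data)) return data.map((obj) => filterCopy(obj, query));
  const out = {};
  for (const key in query) out[key] = filterCopy(data[key], query[key]);
  return out;
}

const rows = [
  { name: "a", size: 1 },
  { name: "b", size: 2 },
];

const q1 = { name: null };
const result = filterInPlace(rows[0], q1);
console.log(result === q1); // true: the result *is* the query object
console.log(q1.name); // "a": the null placeholder is gone

const q2 = { name: null };
const copied = filterCopy(rows, q2);
console.log(JSON.stringify(copied)); // [{"name":"a"},{"name":"b"}]
console.log(q2.name); // null: the query can be reused for every element
```

Once the placeholder has been overwritten, the first version has nothing left to fill in for the next element, which is why it can't simply be mapped over an array.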
Error Handling
Before we move on, this might also be a good time to add some error handling and remove empty arrays and objects. It makes our code a bit more complicated (and much longer) but it will be important later.
/**
 * Return an object with the same shape as `query`, with data from `data`
 */
export function filter(_data, _query) {
  // check arguments
  // - `query` should be `null` or a non-array object
  if (
    !(_query === null) &&
    !(typeof _query === "object" && !Array.isArray(_query))
  ) {
    return { error: "invalid arguments" };
  }

  // fill in null entries
  if (_query === null) return { data: _data };

  // filter arrays of objects
  if (Array.isArray(_data)) {
    let out = [];
    for (const obj of _data) {
      let { data, error } = filter(obj, _query);
      // handle errors
      if (error !== undefined) return { error };
      // only keep non-empty results
      if (
        !(Array.isArray(data) && data.length === 0) &&
        !(typeof data === "object" && Object.keys(data).length === 0)
      ) {
        out.push(data);
      }
    }
    return { data: out };
  }

  // filter recursively
  let out = {};
  for (const key in _query) {
    let { data, error } = filter(_data[key], _query[key]);
    // handle errors
    if (error !== undefined) return { error };
    // only keep non-empty results
    if (data !== undefined) out[key] = data;
  }
  return { data: out };
}
We’ll give this a try too.
import { log_json } from "../util.js";
import { droids } from "../data.js";
import { filter } from "./filter.js";
console.log("// no error");
log_json(
filter(droids, {
name: null,
friends: {
homePlanet: null,
},
})
);
console.log("\n// error");
log_json(filter(droids, []));
// no error
{
  "data": [
    {
      "name": "C-3PO",
      "friends": [{ "homePlanet": "Tatooine" }, { "homePlanet": "Alderaan" }]
    },
    {
      "name": "R2-D2",
      "friends": [{ "homePlanet": "Tatooine" }, { "homePlanet": "Alderaan" }]
    }
  ]
}

// error
{ "error": "invalid arguments" }
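The "only keep non-empty results" check is what removed the friends without a `homePlanet` from the output above. As a rough sketch, the same test could be pulled out into a helper. The name `isEmpty` and the explicit `null` guard are additions made here, not part of the function above.

```javascript
// Hypothetical helper: true for `[]` and `{}`, false for everything else
function isEmpty(value) {
  if (Array.isArray(value)) return value.length === 0;
  // `typeof null` is "object", so rule out null before calling Object.keys
  if (value !== null && typeof value === "object")
    return Object.keys(value).length === 0;
  return false;
}

console.log(isEmpty([])); // true
console.log(isEmpty({})); // true
console.log(isEmpty({ name: "R2-D2" })); // false
console.log(isEmpty(null)); // false
console.log(isEmpty("")); // false
```

The guard matters because `Object.keys(null)` throws a `TypeError`, so a bare `typeof data === "object"` test is only safe as long as `data` can never be `null` at that point.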
Matching
What if we want to only retrieve fields for objects that match some criteria – say, objects with a given ID? Let’s see what we can do.
We’ll remove the restrictions on what kind of values `query` can have.
- If it’s `null` or a non-array object, we’ll do the same thing as before.
- If it’s a different type of value (but not an array), let’s only return objects with the same value.
- If it’s an array of values, let’s only return objects that match one of them.
If an object doesn’t match, we’ll return an error saying so, and filter those out of the result.
/**
 * Return an object with the same shape as `query`, with data from `data`
 */
export function filter(_data, _query) {
  // fill in null entries
  if (_query === null) return { data: _data };

  // filter arrays of objects
  if (Array.isArray(_data)) {
    let out = [];
    for (const obj of _data) {
      let { data, error } = filter(obj, _query);
      // handle errors
      if (error === "match failed") continue;
      if (error !== undefined) return { error };
      // only keep non-empty results
      if (
        !(Array.isArray(data) && data.length === 0) &&
        !(typeof data === "object" && Object.keys(data).length === 0)
      ) {
        out.push(data);
      }
    }
    return { data: out };
  }

  // match
  if (typeof _query !== "object")
    if (_data === _query) return { data: _data };
    else return { error: "match failed" };
  if (Array.isArray(_query))
    if (_query.includes(_data)) return { data: _data };
    else return { error: "match failed" };

  // filter recursively
  let out = {};
  for (const key in _query) {
    let { data, error } = filter(_data[key], _query[key]);
    // handle errors
    if (error === "match failed") return { data: {} };
    if (error !== undefined) return { error };
    // only keep non-empty results
    if (data !== undefined) out[key] = data;
  }
  return { data: out };
}
Because of all the work we did above, that wasn’t too bad!
We’ll give this a quick test
import { log_json } from "../util.js";
import { humans, droids } from "../data.js";
import { filter } from "./filter.js";
console.log("// match non-array in object");
log_json(
filter(droids, {
id: "2000",
name: null,
friends: {
name: null,
},
})
);
console.log("\n// match array in object");
log_json(
filter(humans, {
id: ["1000", "1001"],
name: null,
})
);
// match non-array in object
{
  "data": [
    {
      "id": "2000",
      "name": "C-3PO",
      "friends": [
        { "name": "Luke Skywalker" },
        { "name": "Han Solo" },
        { "name": "Leia Organa" },
        { "name": "R2-D2" }
      ]
    }
  ]
}

// match array in object
{
  "data": [
    { "id": "1000", "name": "Luke Skywalker" },
    { "id": "1001", "name": "Darth Vader" }
  ]
}
and then move on to our final section.
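The two matching rules can also be read on their own, apart from the recursion. As a rough sketch (the `matches` helper is hypothetical; the function above inlines this logic and reports a failed match as an error instead of returning a boolean):

```javascript
// Hypothetical predicate for the two matching rules
function matches(value, query) {
  // array of candidates: match if `value` is one of them
  if (Array.isArray(query)) return query.includes(value);
  // any other non-object: match on strict equality
  if (typeof query !== "object") return value === query;
  // `null` and plain objects are not match queries
  return undefined;
}

console.log(matches("2000", "2000")); // true
console.log(matches("2000", ["1000", "1001"])); // false
console.log(matches("1001", ["1000", "1001"])); // true
console.log(matches(1000, "1000")); // false: no type coercion
```

The last line is worth keeping in mind if queries ever arrive as URL parameters: the IDs in the test data are strings, and neither `===` nor `includes` will treat `1000` and `"1000"` as equal.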
Advanced Matching
If you give a programmer the ability to match fields exactly, they’re going to want… the ability to match fields inexactly! Of course. So let’s see what we can do.
The way `filter` is written right now, it doesn’t make much sense to have `_query` be something like `[{...}]`. We could certainly change that (perhaps by expanding our matching code), but instead let’s use it: if `_query` is an array, and its first value is an object, we’ll use the fields of that object for special functions.
I’ll only implement a few for now – enough to see how you could implement more if you wanted to.
/**
 * Functions for special queries.
 *
 * - return `undefined` to do nothing
 * - if `true` is returned, `filter` should return `{ data }`
 *   - unless `query` is `false` in which case it should return
 *     `{ data: undefined }`. this allows us to filter based on a field without
 *     including that field in the output.
 * - if `false` is returned, `filter` should return `{ error: "match failed" }`
 * - if an object is returned, `filter` should return that object
 */
const specialFunctions = {
  /** Do nothing. This field is used elsewhere. */
  query: () => {},

  /**
   * Paginate, by returning a slice of `data`.
   */
  page: ({ data, opt }) => {
    if (!Array.isArray(data))
      return { error: { opt, message: "page: data must be an array" } };
    return { data: data.slice(opt.start, opt.end) };
  },

  /**
   * Return `true` if `data` is a string, and starts with `opt`.
   */
  startsWith: ({ data, opt }) => {
    if (typeof data !== "string")
      return { error: { opt, message: "startsWith: data must be a string" } };
    return data.startsWith(opt);
  },

  /**
   * Return `true` if `data` includes `opt`.
   */
  includes: ({ data, opt }) => {
    if (!Array.isArray(data))
      return {
        error: { opt, message: "includes: data must be an array" },
      };
    return data.includes(opt);
  },
};
/**
 * Return an object with the same shape as `query`, with data from `data`
 */
export function filter(_data, _query) {
  // special query
  // - a query of the form `[{...}, ...]`
  // - anything in the array besides the first object is ignored for now
  let specialQuery;
  if (Array.isArray(_query) && typeof _query[0] === "object")
    specialQuery = _query[0];

  // special queries can have regular queries inside
  let query = _query;
  if (specialQuery) query = specialQuery.query;

  // fill in null entries
  // - including when a special query's query is `null`
  if (query === null) return { data: _data };

  // filter arrays of objects
  if (Array.isArray(_data) && typeof _data[0] === "object") {
    let out = [];
    for (const obj of _data) {
      let { data, error } = filter(obj, query);
      // handle errors
      if (error === "match failed") continue;
      if (error !== undefined) return { error };
      // only keep non-empty results
      if (
        !(data === undefined) &&
        !(Array.isArray(data) && data.length === 0) &&
        !(typeof data === "object" && Object.keys(data).length === 0)
      ) {
        out.push(data);
      }
    }
    // special query: pagination
    if (specialQuery?.page)
      return specialFunctions.page({ data: out, opt: specialQuery.page });
    return { data: out };
  }

  // special queries
  for (const key in specialQuery) {
    if (!(key in specialFunctions))
      return { error: { field: key, message: "unknown special query field" } };
    const value = specialFunctions[key]({
      data: _data,
      opt: specialQuery[key],
    });
    if (value === undefined) continue;
    if (value === true)
      if (query === false) return { data: undefined };
      else return { data: _data };
    if (value === false) return { error: "match failed" };
    if (typeof value === "object") return value;
    // wrap in `error` so callers, which only check `error`, don't drop it
    return {
      error: { value, message: "unknown special function return value" },
    };
  }

  // match
  if (typeof query !== "object")
    if (_data === query) return { data: _data };
    else return { error: "match failed" };
  if (Array.isArray(query))
    if (query.includes(_data)) return { data: _data };
    else return { error: "match failed" };

  // filter recursively
  let out = {};
  for (const key in query) {
    let { data, error } = filter(_data[key], query[key]);
    // handle errors
    if (error === "match failed") return { data: {} };
    if (error !== undefined) return { error };
    // only keep non-empty results
    if (data !== undefined) out[key] = data;
  }
  return { data: out };
}
The first thing to notice is that this is getting not only long, but complicated. There were several times while trying things out that I had to step through the code because I wasn’t sure what was going on, and many of those times I found a mistake.
I’ve also started to notice little idiosyncrasies that would take some work to smooth out. For instance, if you want to match a field without including it in the output, you say `[{ query: false, ... }]`. Why `false`? Because `null` was already taken and `undefined` won’t make it through `JSON.stringify`. Made sense to me at the time, still makes sense to me now, but I really don’t know if it was the best choice. This is starting to become a language design issue, and those take a bit more care than I can spend on this right now.
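The point about `undefined` is easy to check. Here is a quick sketch, using queries shaped like the ones in this section, of what survives a trip through JSON (the variable names are just for this example):

```javascript
const withUndefined = [{ query: undefined, includes: "JEDI" }];
const withFalse = [{ query: false, includes: "JEDI" }];
const withNull = [{ query: null, includes: "JEDI" }];

console.log(JSON.stringify(withUndefined)); // [{"includes":"JEDI"}]
console.log(JSON.stringify(withFalse)); // [{"query":false,"includes":"JEDI"}]
console.log(JSON.stringify(withNull)); // [{"query":null,"includes":"JEDI"}]
```

After the round trip, a query with `query: undefined` is indistinguishable from one that has no `query` field at all, so it can't carry the "match but don't include" meaning over the wire.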
And if you read through carefully, I’m sure you’ll discover many things that I’ve missed. But I think I am officially tired ;-)
Tests
Before we stop, I promised you tests! In real life, these would be automated in a testing framework (I actually used Vitest to help with coverage data while writing these out), but for illustration purposes I think this is more helpful.
It’s the last version of our `filter` function, and it’s gotten quite complicated, so we’ll do a more thorough job testing this time around.
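For a sense of what the automated form might look like, here is a dependency-free sketch that turns one printed example into an assertion. It does not use Vitest; the `check` helper is made up for this example, and to keep it short the function under test is the small Basic Filtering version with two hand-written rows rather than the final `filter` and the full data set.

```javascript
// Stand-in for the function under test (the Basic Filtering version)
function filter(data, query) {
  if (query === null) return data;
  if (Array.isArray(data)) return data.map((obj) => filter(obj, query));
  const out = {};
  for (const key in query) out[key] = filter(data[key], query[key]);
  return out;
}

// Hypothetical helper: compare by JSON text, throw on mismatch
function check(label, actual, expected) {
  const a = JSON.stringify(actual);
  const e = JSON.stringify(expected);
  if (a !== e) throw new Error(`${label}: expected ${e}, got ${a}`);
  console.log(`ok - ${label}`);
}

// Two hand-written rows instead of the full data file
const droidRows = [
  { id: "2000", name: "C-3PO" },
  { id: "2001", name: "R2-D2" },
];

check("fill in null entries", filter(droidRows, { name: null }), [
  { name: "C-3PO" },
  { name: "R2-D2" },
]);
```

Comparing JSON text keeps the expected values in the same form as the printed output in the sections that follow, at the cost of being sensitive to key order.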
fill in null entries
import { log_json } from "../util.js";
import { droids } from "../data.js";
import { filter } from "./filter.js";
console.log("// fill in null entries");
log_json(
filter(droids, {
name: null,
})
);
// fill in null entries
{ "data": [{ "name": "C-3PO" }, { "name": "R2-D2" }] }
matching
import { log_json } from "../util.js";
import { humans } from "../data.js";
import { filter } from "./filter.js";
console.log("// match");
console.log("\n// value");
log_json(
filter(humans, {
id: "1000",
name: null,
})
);
console.log("\n// array");
log_json(
filter(humans, {
id: ["1000", "1001"],
name: null,
})
);
console.log("\n// subfield value");
log_json(
filter(humans, {
id: "1000",
name: null,
friends: {
id: "1002",
name: null,
},
})
);
console.log("\n// subfield array");
log_json(
filter(humans, {
id: "1000",
name: null,
friends: {
id: ["1002", "1003"],
name: null,
},
})
);
// match

// value
{ "data": [{ "id": "1000", "name": "Luke Skywalker" }] }

// array
{
  "data": [
    { "id": "1000", "name": "Luke Skywalker" },
    { "id": "1001", "name": "Darth Vader" }
  ]
}

// subfield value
{
  "data": [
    {
      "id": "1000",
      "name": "Luke Skywalker",
      "friends": [{ "id": "1002", "name": "Han Solo" }]
    }
  ]
}

// subfield array
{
  "data": [
    {
      "id": "1000",
      "name": "Luke Skywalker",
      "friends": [
        { "id": "1002", "name": "Han Solo" },
        { "id": "1003", "name": "Leia Organa" }
      ]
    }
  ]
}
pagination
import { log_json } from "../util.js";
import { humans } from "../data.js";
import { filter } from "./filter.js";
console.log("// pagination");
console.log("\n// without pagination");
log_json(filter(humans, [{ query: { name: null } }]));
console.log("\n// entries 0..2");
log_json(
filter(humans, [
{
page: { start: 0, end: 2 },
query: { name: null },
},
])
);
console.log("\n// entries 2..4");
log_json(
filter(humans, [
{
page: { start: 2, end: 4 },
query: { name: null },
},
])
);
console.log("\n// entries 4..end");
log_json(
filter(humans, [
{
page: { start: 4 },
query: { name: null },
},
])
);
// pagination

// without pagination
{
  "data": [
    { "name": "Luke Skywalker" },
    { "name": "Darth Vader" },
    { "name": "Han Solo" },
    { "name": "Leia Organa" },
    { "name": "Wilhuff Tarkin" }
  ]
}

// entries 0..2
{ "data": [{ "name": "Luke Skywalker" }, { "name": "Darth Vader" }] }

// entries 2..4
{ "data": [{ "name": "Han Solo" }, { "name": "Leia Organa" }] }

// entries 4..end
{ "data": [{ "name": "Wilhuff Tarkin" }] }
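The `start` and `end` values are passed straight to `Array.prototype.slice`, so `end` is exclusive and may run past the end of the array. If a client thinks in page numbers instead, a small conversion is enough. The `pageOpt` helper below is hypothetical, and the short name list is made up to keep the example self-contained.

```javascript
// Hypothetical helper: build the `page` option for a zero-based page number
function pageOpt(pageNumber, pageSize) {
  const start = pageNumber * pageSize;
  return { start, end: start + pageSize };
}

const names = ["Luke", "Vader", "Han", "Leia", "Tarkin"];
for (let n = 0; n < 3; n++) {
  const { start, end } = pageOpt(n, 2);
  console.log(n, names.slice(start, end));
}
// 0 [ 'Luke', 'Vader' ]
// 1 [ 'Han', 'Leia' ]
// 2 [ 'Tarkin' ]
```

Note that `filter` slices the already-filtered `out` array, so when a query also matches on fields, pages are counted over the matching results rather than over the original data.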
special queries
import { log_json } from "../util.js";
import { humans } from "../data.js";
import { filter } from "./filter.js";
console.log("// special queries");
console.log("\n// starts with");
log_json(
filter(humans, {
name: [{ startsWith: "L" }],
})
);
console.log("\n// includes");
console.log("\n// --- with field");
log_json(
filter(humans, {
name: null,
appearsIn: [{ includes: "JEDI" }],
})
);
console.log("\n// --- without field");
log_json(
filter(humans, {
name: null,
appearsIn: [{ query: false, includes: "JEDI" }],
})
);
// special queries

// starts with
{ "data": [{ "name": "Luke Skywalker" }, { "name": "Leia Organa" }] }

// includes

// --- with field
{
  "data": [
    { "name": "Luke Skywalker", "appearsIn": ["NEWHOPE", "EMPIRE", "JEDI"] },
    { "name": "Darth Vader", "appearsIn": ["NEWHOPE", "EMPIRE", "JEDI"] },
    { "name": "Han Solo", "appearsIn": ["NEWHOPE", "EMPIRE", "JEDI"] },
    { "name": "Leia Organa", "appearsIn": ["NEWHOPE", "EMPIRE", "JEDI"] }
  ]
}

// --- without field
{
  "data": [
    { "name": "Luke Skywalker" },
    { "name": "Darth Vader" },
    { "name": "Han Solo" },
    { "name": "Leia Organa" }
  ]
}
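The special functions tested above all follow the same return-value protocol (`true`, `false`, or an object), so adding another one is mostly a matter of writing one small function. As an illustration only, here is what an `endsWith` counterpart to `startsWith` might look like. It is kept in a separate, hypothetical `extraFunctions` object so the snippet runs on its own.

```javascript
// Hypothetical addition, mirroring `startsWith`
const extraFunctions = {
  /** Return `true` if `data` is a string, and ends with `opt`. */
  endsWith: ({ data, opt }) => {
    if (typeof data !== "string")
      return { error: { opt, message: "endsWith: data must be a string" } };
    return data.endsWith(opt);
  },
};

console.log(extraFunctions.endsWith({ data: "Han Solo", opt: "Solo" })); // true
console.log(extraFunctions.endsWith({ data: "Leia Organa", opt: "Solo" })); // false
console.log(extraFunctions.endsWith({ data: 77, opt: "Solo" }));
// { error: { opt: 'Solo', message: 'endsWith: data must be a string' } }
```

Moved into `specialFunctions`, this should let a query such as `{ name: [{ endsWith: "Solo" }] }` work without any change to `filter` itself, since the dispatch loop looks functions up by key.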
errors
import { log_json } from "../util.js";
import { humans, starships } from "../data.js";
import { filter } from "./filter.js";
console.log("// errors");
console.log("\n// paginate non-array");
console.log("\n// --- data");
log_json(filter(starships[0], [{ query: { name: null } }]));
console.log("\n// --- error");
log_json(filter(starships[0], [{ page: { start: 0 }, query: { name: null } }]));
console.log("\n// starts with non-string");
console.log("\n// --- data");
log_json(filter(humans, { name: [{ startsWith: "L" }] }));
console.log("\n// --- error");
log_json(filter(humans, { mass: [{ startsWith: "L" }] }));
console.log("\n// includes non-array");
console.log("\n// --- data");
log_json(filter(humans, { name: [{ startsWith: "L" }] }));
console.log("\n// --- error");
log_json(filter(humans, { name: [{ includes: "L" }] }));
console.log("\n// unknown special query field");
console.log("\n// --- data");
log_json(filter(humans, { name: [{ startsWith: "Le" }] }));
console.log("\n// --- error");
log_json(filter(humans, { name: [{ fake_field_name: "L" }] }));
// errors

// paginate non-array

// --- data
{ "data": { "name": "Millennium Falcon" } }

// --- error
{
  "error": {
    "opt": { "start": 0 },
    "message": "page: data must be an array"
  }
}

// starts with non-string

// --- data
{ "data": [{ "name": "Luke Skywalker" }, { "name": "Leia Organa" }] }

// --- error
{ "error": { "opt": "L", "message": "startsWith: data must be a string" } }

// includes non-array

// --- data
{ "data": [{ "name": "Luke Skywalker" }, { "name": "Leia Organa" }] }

// --- error
{ "error": { "opt": "L", "message": "includes: data must be an array" } }

// unknown special query field

// --- data
{ "data": [{ "name": "Leia Organa" }] }

// --- error
{
  "error": {
    "field": "fake_field_name",
    "message": "unknown special query field"
  }
}
Conclusion
I’m pretty happy with what we accomplished here. We took our little filter function and grew it into something that can do some fairly sophisticated things. Most importantly, we had fun thinking about how all these things could be done.