Welcome fellow learners, in this blog post, I'll be writing about how I created a JSON file data filterer and how you code one yourself. The program lets you filter out the unwanted value from a JSON file and generate a new file for it. This program could work mostly thanks to the recursive functions I used. You can access the source code through this Github link. This blog post will be quite long as this blog post is the documentation for it.
Well,
lets first talk about why I coded this program. So I was working on improving
the user experience of my weather app (weather-witness) by allowing the user to have a drop-down
search whenever they type in a new character into the search field. To do that
I need a list of cities names, and as OpenWeatherMap has a JSON file on the
list of cities name. I took the file, and when I opened it up, the list was
great. Every data was listed correctly. But there's some data that I don't
need, and it'll be annoying to delete them one-by-one. So I decided to make a
simple script to automate the data removing using Nodejs.
The weather app |
Why
Nodejs and why not python, you ask? Well, my answer is, why not? I've been
wanting to create something using Nodejs, and I thought this might be a good chance for it.
The index for this post:
- How does the program work?
- The technical part.
- The main function & getting user input
- Getting file content of the JSON file
- The FinishOperation() function
- Getting JSON data Available key
- Getting user input on which properties to be deleted
- Deleting the selected keys of the properties
- Generating a new file for the filtered JSON data
- Conclusion
How does the program work?
So,
now lets talk about how does the flow of the program look like. When we execute
the script in our terminal, the program will first ask the user to provide it
with the location of the JSON file. Then, the program will check whether the
data is in a valid JSON format. If the file contains some malformed JSON data,
the program will produce an error which tells the user the isn't a valid JSON
file. The program will continue with the next operation if it's a valid JSON
file.
Then,
after getting and checking the JSON file, the program will store it inside a
variable and find all the available key for the JSON data. The program will
list out the key's for every available data level in the file. The key are
stored in an array as strings. Keys that're a descendent of another key would
have "." in between its key name and parent's key name. This also
applies if they're both child key of another parent key, the name for each
level will be separated with a "." character. For example, "ancestor_name.parent_name.child_name".
I'll explain to you its technical part when we arrive there, for now, let's
understand the flow of the program first.
After
the program has found all the key available in the JSON file, the program will
then ask the user which of the keys that he/she wants to remove from the data.
After the user had chosen one or more keys that he/she (separated by
",") intends to remove from the file, the program would ask for
confirmation for the operation first. After the approval is done, the program
will then loop through the keys given by the user and checks whether the keys
are valid or not.
Example of the script running |
Keys selection confirmation |
After
that, the program would filter out the keys from the variable that stored the
JSON data just now. Then, after filtering the data, the program will then
generate a new file for the JSON data. The file will be saved in the user's
desktop directory.
Program finishing its operation by showing the example JSON data and generating a new file to store the data |
The technical part
Now, let's get started with the technical part of the project. First, we'll need to import some libraries for our project. We'll be using a third-party library called "prompts" as well as Nodejs filesystem library, "path", for dealing with reading and writing data to file. We'll also need to import another file that contains part of our project's code called "utility.js". The file contains most of the code that we'll need. For example, the function that let us read files and as well as the function that let us create a new file and function that filter out the specified JSON keys.
Importing libraries |
The main function & getting user input
After that comes to our main function, which is an asynchronous anonymous function, I used a function to wrap the prompts operation in an async function called getUserInput() so that I could use back the code when I need to. Since the function is an asynchronous function, when we call the function, we need to use await to wait for the user input operation to complete before moving forward to the other code.
getUserInput() function to read user input in terminal. |
Our main async anonymous function. |
Getting file content of the JSON file
Getting file content from the location provided by user |
The FinishOperation() function
finishOperation() function, start by getting available keys |
After
validating and parsing the JSON file data, we can then pass the JSON data into
the finishOperation(). In this function, we'll first get the list of the
available key for the JSON data by calling the utils.getAvObjKeys(JSON)
function. The avObjKeys will be used to store the result returned by the
function.
Getting JSON data Available key
getAvObjKeys(), used to get all the available keys in the JSON data. |
Inside our getAvObjKeys() function, we first check whether the JSON data we're using is an object and not a value. If the data is not an object, we'll output an Error message on the terminal to notify the user. While, If the JSON data is in object form, we'll then check if its value is in array form. If the data is in array form, we will then use a forEach loop to iterate through the array. When iterating through the array, we get the keys for the properties inside the array by using Object.key(). Then we'll pass the array's element along with their respective available inner properties keys and an empty avKeys array to the getInnerKey() function.
This also goes almost the same when the JSON data is in object form instead of an array. We'll get all the available keys for the JSON properties, as they're not in array form so we can just straight up to get the available properties key and pass them to the getInnerKey() function.
getInnerKey() function, used to search for inner keys of an Object. |
Now,
in the getInnerKey() function, the function actually have four arguments the
first three arguments are the JSON object while the second one is the current
key that we want to check if the value of the property with the supplied key
contains another object. The third arguments are the array that contains the
previously saved keys name. While for the fourth arguments, it's the string
that contains the key name of the previous level. You'll understand why we need
it when I explain the recursion I used.
We
then need to pass the content of the object with the supplied key into a
variable (reduce the code we need to write later on). If the content of the
property is indeed an object, then we'll need to get all available keys in it
not to mention the keys that are on a lower level. Now, you might be thinking
is it even possible to get all the keys? Well, this is when recursion come in
handy.
Since
we will be dealing with multiple levels of property keys, we need a way to
record the name that's constructed at the previous level. This is why we have a
fourth argument for the getInnerKey() function, "head". When the
content of the head is empty, it means that the JSON object provided is from
the first level of the JSON data. We'll then store it in the nameTemp variable
and push it into the avKeys array if it's not yet in it. The content of the
inner object then will be verified to see if its an array of object/value or is
it an object. If the inner object of the JSON object is an array, we'll then
loop through the elements in the array. After that, we'll get all the available
keys from the array elements and then loop through the available keys.
When
looping through the keys, we passed the key to another function called
generateInnerKey(), where we generate the keys name according to their level.
The arguments that we passed are the array's element JSON properties value, the
current property key's name, the inner key's name of the array's element JSON
value, the array of the available keys for the JSON data and lastly the name of
the previous level key. While the value passed when the inner object is an
object is different than when it's an array. We'll straight up to get all of
its key and loop through the keys. When we loop through the keys, we passed the
key one by one to the generateInnerKey() function. This time the first argument
is the inner object itself instead of the value of its properties. If the inner
value of the current JSON property is not an object, we'll check if the key for
that object is already in the avKeys array or not. If it's not in the array,
push the value into the array and ignore it otherwise.
Generating names as well as use recursion to loop through lower-level keys. |
In
the generateInnerKey () function, we have a total of 5 parameters; theELe,
currEle, innerKey, avKeys, and head. theEle is the current JSON object that we
got from the previous level. while the currEle is the key name of the current
JSON property. The innerKey are the available keys in the current JSON object
while avKeys is the array of all the JSON keys that we have stored from the
previous operation. Lastly, the head parameter is the string that stores the
name from the previous level to construct the current keys name.
In
this function, we'll first construct the name of the key by combining the
previous level property's key name with the current level key name. If the head
parameter is not empty, that means that the current operation is from a lower level.
Therefore, we also need to combine the key name from the previous object level
with the current one's (while ignoring the previous level key name, as we've
already had its key name in the head parameter). After that, we check for whether the constructed
key name is already present in the avKeys array and push the newly created key
name into it if it's not yet present in the array.
After
all of that, we now need to check again to see whether the value of the inner
object is an object. If the value turns out to be an object (array also count
as object), we need to call again the getInnerKey() function again (recursion).
When all the available keys are found, the keys will be returned to the first
getInnerKey function call. Then, the avKeys array will then be returned back to
the finishOperation() function.
Getting user input on which properties to be deleted
Now,
we'll need to print out the available keys for the user to filter onto the
terminal screen as well as asking the user which of the keys that he/she want
to filter. But before that, we'll need to print out the available keys to the
users so that he'll know what's the available options. Since the user might
have entered some invalid keys name, we need to use a loop to repeat the
operation when that happens. After getting the keys from the user, we then need
to ask for confirmation from the user whether the options they provided are the
ones that they want.
Going back to the finishOperation() function, we now need to print out the available options. |
Ask confirmation for the options selected by user. |
If
the inputted value's first character is "y", then we will check the
inputted options value by the user. We then need to create an array of options
(in string form) from the value by splitting the string using comma (,) as the
separator. Then trim() the strings in the array produced by using map() to
remove the leading and trailing whitespace. After we have generated the array
of options provided by the user, we then can pass the array into a function
called checkOptions() to check if the options offered by the user is valid.
If the user have confirmed it, we continue by checking the value provided. |
The
checkOptions() function takes in two arguments, the array of options generated
from user input and the array of all the available keys for the JSON data. This
function basically just loop through the options array and search for the
string inside all the available keys. If the function failed to search for the
string in the array, then the function will return false back to the search
back to the finishOperation() function. But if when the options array has
completed looping through its element, and the function has not returned the
false value to its caller, it means that the options provided by the user are
indeed valid. The function will then return the value true to its caller, finishOperation().
checkOptions() function to check the options provided by the user with the available keys. |
If
the options are valid, then we can continue with the data filtering and new
file generation section. To filter the selected data from the JSON data, we need to call the filterJSON() function.
This function will return a newly filtered JSON data back to the
finishOperation() function. We'll provide the function call with two arguments,
the JSON data and the selected options offered by the user. In this function,
we'll first check if the JSON data contained in the first level is in Array
form. If its an array we need to loop
through the array while looping through the options. When looping through the
options array, we need to determine whether the keys include a dot (.) symbol
in its string. For those that contain the sign, it means that these properties
are at a lower level. When that happens, we will call a function called
filterInnerData() to deal with the properties that are at lower levels. For the
option string does not have the dot (.) sign in it, then we can just delete the
property by using delete json_Object[key_name].
Checking the validity of the options, show error and break the loop if an error occurred. |
Deleting the selected keys of the properties
filterJSON() function to remove the unwanted properties from the JSON data |
If
the content of the JSON data at the first level is not an array but an object,
we need to loop through the options array and check which options have the dot
(.) sign in it. The options that contain the dot sign (.) will trigger the
filterInnerData(). Then, for the options that do not contain the dot sign will
straight up use delete json[ele] to delete the JSON property from the JSON
data.
While
in the filterInnerData() function, we take in two arguments, the JSON object
(parent) and the key that contains the dot sign. In this function, we will also
be using recursion to filter out the desired properties completely. First, we
will split the key names using the dot sign as a separator. Then, we store the length
of the array into the variable arrLen for future use. Then we referenced the
parent object to a variable called obj. This "obj" object will be
used to stack up the object that we will be removed from the JSON data
(referencing object dynamically).
Our filterInnerData(), used to filtering properties at the lower level. |
Add caption |
After
that, we need to determine whether the content of the property is an object or
a value. I used tempKey[0] as the
key for the property because it's the first level key for the object. We'll
remove it when we use recursion to call back this function so don't worry much
about it. If its an object type value (object/array) then we now need to
determine if its an array or an object. If the content of the property is an
array, we need to loop through the value of the property. Then we'll call the
filterInnerData() again while providing the value of the property as well as
the key name (as its not the last level of the option provided). We'll also
need to cut off the first section of the key string before we pass the key name
to the filterInnerData() function again.
If the content is an object type value. |
If
the provided value of the property is an object, then we need to have a way to
refer to the object that we want. To do this, we need to stack up the
references using a for a loop. In the loop, we assigned the ele variable with the
key name according to the level. Then we'll use the property according to the
key provided in each loop. In every loop, if the value of a property is an
array, we need to loop through the elements in it and call the function
filterInnerData() to filter out the lower level's property. When calling the
function, we'll also need to cut off the used key section from the array and
pass a string constructed from the remaining section while using the dot sign as the combiner. When the object
references is stacked up, we then can delete the property using the last part
of the key section array (tempKey) as the key name.
If the content is a object typed value and not an array. |
If
in turn, the value of the property is not an object, its means that we're
already at the last section of our destinated location. Then we can just delete
the property using the key and JSON object provided in the parameter.
If the content is a value (number, string, character, etc). |
Generating a new file for the filtered JSON data
These
processes repeated until the JSON data is entirely filtered by the options
chosen by the user. When the filtering process is completed (the filtered data
return the JSON data to the finishOperation() function back), we then can
generate a new file to store the JSON data. Before we generate a new file for
the data, it might be wise for us to print out a sample of the filtered JSON
data so that user can determine if they've correctly filtered out the unwanted
properties.
After filtering the data, and new JSON data returned to finishOperation(). |
Now to generate a new fie, we'll call a function called generateJsonFile() while providing two arguments to the function; the JSON data and the basename of the file that we read to get the content of the JSON data. In the function, we will create a file in user's desktop directory with the basename of the file we used to read the data from while adding the text "filtered_" in front of the filename.
The generateJsonFile() function. |
Conclusion
- Users could delete properties at a certain index of the array in a JSON file.
- Better user interface.
- options for users to save the file at a specific location instead of their desktop.
No comments:
Post a Comment