Friday, March 10, 2023

Contains Duplicate (Leetcode)

I wrote a post roughly 2/3 years ago regarding data structures and algorithms. I thought I'd follow up with some questions I'd come across such as the famous Blind 75 Leetcode Questions and write down what my thought process to approaching the problem + run time and space complexity of the solution for the given question.

I usually solve all my questions in Python. The first of which questions we will work on is Contains Duplicate:


Regardless of the approach, we'll have to check every number in "nums" so our run time will be O(length of nums) at a minimum. We'll also assume within nums there is only one possible duplicate.

What's the most intuitive solution when we read through this? Most likely checking the current number n against all other numbers (n+1, n+2, n+3...).

We'll use the first example: we're on the first element, 1. We'll have to check 1 against 2, 1 against 3, 1 against 1. Then we'll check 2 against 3, 2 against 1, we don't have to check 2 against the initial number because we have already made that comparison before.

So we'll come up with something like the below:


Code is pretty simple, we use a for loop to iterate through nums and within that loop, we'll do another for loop to check elements of n+1 until we reach n + length(nums) - 1. This results in a O(N^2) solution because for each iteration of nums we're roughly going through it length(nums) - index times. The space complexity of this is O(1) since no extra structure is used to store anything.

Is there anything we can do to reduce the number of times we have to iterate through the list? We can perhaps use a dictionary, a key/value store. We will use the number as the key and a value depending on how many times we have come across the number. Using Example 1:
  1. first iteration, the value of element at iteration 0 is 1, does the key 1 exist in the dictionary? No, we'll add it and set the value to 1
  2. second iteration, the value of element at iteration 1 is 2, does the key 2 exist in the dictionary? No, we'll add it and set the value to 1
and so on until we hit iteration 4, does the key 1 exist in the dictionary? Yes! we'll return True since the key exists and this means we have found a duplicate. In another scenario if we iterate through the whole list and no duplicates are found we'll return False.


What is the complexity of the above code? O(N) time-complexity since we're only iterating through the array once (we have to create a dictionary from nums). O(N) space-complexity since we'll need to store values up to length nums in the variable my_dict.

Is there a more concise way to do this? Actually there is, we can use a structure called Sets, a set in Python is similar to a list (array) except it does not allow duplicate values.

Given a set cannot contain duplicate elements, we can compare it to the original list, if both are the same we return False, otherwise if they're not the same that must mean the nums must contain a duplicate. We arrive at something like below:


What we're doing in this case is comparing the length of nums against the length of nums turned into a set. If there are no duplicates both lengths should be equal and we'll get a value of 0 (False). Otherwise we'll get a value of 1 (True) since the length of the set nums cannot contain any duplicate values the relation between the two lengths would be something like this: length of nums >= length of set(nums).

What is the complexity of the above code? Same as the dictionary solution, as we'll only iterate through nums once and we'll have to create a set of length nums (roughly) to store values up to length nums.

Wednesday, April 21, 2021

Documentation!

 Recently I've been working with a fork of the Open edx platform (https://github.com/edx/edx-platform). My coworker and I were tasked with changing the built in player which accepts a mp4 link (youtube or s3), into one that allows users to upload a file to the platform that to be indexed into the user's s3 bucket and then update the player with the uploaded video. Sounds fun right? I assure you, it is not.


I have no idea how big the code base is. It is massive and contrary to the implementation docs, the development documentation are hugely outdated. After struggling with the code, my coworker and I were able to complete the clients request and implemented the customized workflow. 


Then came the next request, analytics: Which users of the platform completed which modules? How long did a user spend watching a video? How many users enrolled in a specific course this week? We proposed two solutions, one was a hugely outdated, but easy to implement analytics providing a high-level overview of users of the platform. Unfortunately the library wasn't updated to support the newest release of Open edX platform we were using (Koa.2) and worst of all, it didn't provide the granularity the client was asking for. 


Great. Second solution: Use Open edX's own data analytics solution, this alternative is infamous for clogging up the Open edX's forum with requests to help debug the installation/implementation of it (including yours truly) and forcing many people to instead use the first solution. Just check out how amazing this document is teaching you to implement this, enjoy the diagram. 


I finally found some extremely old documentation written by the chief architect of edX. Check out the documentation for it. This was created in 2015 and has been barely updated since, one of the steps even tells you to use a Hadoop 2.3 when the repo has already migrated to use Hadoop 2.7. Out of all the instructions, I can personally tell you I had an issue with literally every step that required me to hack at it, there is no FAQ and the slack channel + forums are so quiet, you'd be ecstatic to even get a response. If it is one thing OSD has taught me, it was to research and adapt and maybe there might be a light at the end of it all. Here's a list of topics I had to research about to almost finally get this working:

  • Hadoop
  • Hive
  • Ansible
  • Django
  • Tutor Open edX
  • Virtual env
  • AWS
  • MySQL
  • Oauth2
  • Luigi
All this because, documentation is not up to date. People who work on Telescope, be thankful the slack channel is extremely active and the documentation is up to date.

Thursday, October 8, 2020

Fun with Docker

I'm starting a new project for work soon. Instead of developing from the ground up, we'll be using an existing open-source solution and customizing it to suit the project's need. I figured this is going to be more devops and less coding as we'll mostly be doing modifications on the existing code and decided to play around with Docker since I don't have much experience writing Dockerfiles or docker-compose files.

I took one of my side projects I created with a node.js backend and react frontend and decided to "dockerize" it. I decided to start with the front-end for now since the backend requires Redis.

# Use the official image as a parent image
FROM node:lts-alpine

# Set the working directory.
WORKDIR "/adventure-capitalist"

#Copy the file from host to current location
COPY package.json .

# Run the command inside your iamge filesyste.
RUN npm install

# copy the rest of app's source code from host to image files
COPY . .

# Add the metadata to the image to describe which port the container is listening to 
EXPOSE 3000

# RUN cd /adventure-capitalist/src/backend/data && node app.js

# Run the specified command within the container.
CMD [ "npm", "start" ]

After writing this file, I built it using docker build --tag adventure-capitalist:1.0 .

Docker builds the image and after building it, I run it in a container using: docker run -p 8000:3000 --name ac adventure-capitalist:1.0 


This should do it right? I'll go to localhost:8000 and should see my front-end right? NOPE I logged the container with the command docker logs ac, and didn't see any issues. Decided to ask @manekenpix for a second pair of eyes + google "dockerizing a react app" and found out why it wasn't working . I needed to add the -it flag otherwise it won't work:


So the full command I had to use was: docker run -it -p 8000:3000 --name ac adventure-capitalist:1.0


References:

https://mherman.org/blog/dockerizing-a-react-app/

Friday, September 25, 2020

Padding your Github stats

These past few days I got to have some fun reviewing github repos in @humphd's OSD600 class (I made sure I didn't take all the issues). I primarily focused on Python and Javascript/Node.js repos as I'm most comfortable with them. It was actually really fun as I got to suggest some things they did not think about or things I learned while attending OSD600/OSD700 from a year ago. I must say though, these students are way better than when I first started OSD600... Extremely looking forward to their contributions for Hacktoberfest and even more so if they decide to contribute to Telescope

Monday, September 21, 2020

Pre-September

 Over the summer I was employed as a research assistant for Seneca on an NLP project. Now that we're wrapping up and with October coming around I decided to play around with Golang and look for some projects to work on aside from slaving away contributing to Telescope. My previous professor linked the repo for the backend of the Canadian COVID app and I found the perfect issue. If successful, I'll have contributed to something used across Canada...?




Friday, June 5, 2020

Attempt at Creating a Clone of Adventure Capitalist

After about 3 weeks working on this project, I'm kind of done. Built with React/Node/Redis/SocketIO I learned a lot. The reason I say I'm kind of finished is because unless I overhaul the whole backend of the code, I don't think I can get it working 100%. ... I know it looks awful haha.


The project was fun and challenging, there weren't any guidelines on how the project should be built aside from it being written in Javascript/Typescript. I initially tried using Typescript, but it gave me headaches with the difference in import exporting. This is something I'll probably need to learn more about.

You can check out a version of a working game here (not mine). The hardest part about creating this clone is regarding hiring a manager. Hiring a manager takes care of clicking for one of these shops. The number on the right shows the "cooldown" of the button before it can be pressed again. A manager will auto this process so whenever the shop is off cooldown, it should be clicked, the timer is actually reflects how much cooldown time is left, there should also be a progress bar providing a visual representation.

There were 2 issues to think about:

  1. The initial max cooldown of the Lemonade shop is 500ms, when a player purchases a certain amount of the shop, the max cooldown is actually halved or and it becomes exponential with each threshold reached. So potentially this shop could be firing a request every 7.8ms (500 / 64), what tool should I use for this?
  2. How should I manage the auto clicking by managers?
  3. The managed shops should also be running even if the window isn't open, players should be informed of how much cash they have earned while the window was closed
I looked around a bit and decided to use websockets, specifically Socket.IO. I thought using the traditional HTTP/GET request would destroy the backend since there could be a ton of requests being sent.

The second issue I kept thinking of was how to create the auto function for managing a shop + keeping track of how much time was left AND having all this be reflected on the front-end. After thinking about this for a few days and getting nowhere, I reached out to @humphd who suggested using the TTL(time to live) + Pub/Sub functionalities of Redis. This was pretty cool as it had me researching about keyspace notifications for Redis. That's all for now... I may blog more about this later.

Wednesday, May 6, 2020

Typescript + Linters

Taking a small break from Telescope until the summer semester resumes. I've started collaborating with a elementary school friend on a project to build a clone of the game Adventure Capitalist. After working with Javascript for so long, I decided to try doing this in Typescript. It went pretty well up until I had the following line of code:

const index = this.shops.findIndex((shop: Shop) => shop.name == shopName);

When I was trying to compile my code, I kept getting the following error

Property 'findIndex' does not exist on type 'Shop[]' 

Pretty sure this should work as shops is an array of type Shop. As a developer usually does when they run into issues, I started googling the problem and checking Stack Overflow. It recommended I change my tsconfig.json "target" to es2015 and the findIndex() is an es6 function and add es6 to "lib". I did all that and tried compiling, still no good. I reached out to my frequent collaborator from Telescope @manekenpix and he suggested I just try running the code. It works?

Turns out it was a linter issue, although it still compiled properly. Upon further research 2 hours later, I realized I was using the cli command wrong, or at least the way I was using it was going to cause errors. I was compiling my .ts to .js by using the command tsc index.ts instead of tsc, when a specific file name is used, it will disregard the tsconfig.json file settings and just try to compile your Typescript to Javascript. So I tried running 'tsc', it worked! No errors and it was outputting all the compiled .js files inside my /build folder (ignored in .gitignore) I specified in my tsconfig.json file.

Thursday, April 30, 2020

Data Structures and Algorithms

I'm finally done all my courses and since the job market isn't that great right now, I have taken a different approach. Instead of working personal projects or contributing to open source, I've decided to brush up on data structures and algorithms for a bit.

One thing I found lacking for Seneca's Computer Science related programs was the science portion of Computer Science, maybe it was because I was enrolled in CPA and not their BSD program.

For CPA the only course that deals with data structures and algorithms, DSA555 is offered as a professional option. After taking the course I noticed why, as a pretty smart person in the class said "If this was a mandatory course, a lot of people would be dropping out of the program, it was pretty hard". I still wish there was another similar course or two offered so we could learn more about analyzing the run times of more complex functions and graphs.

I took DSA555 last winter and have more or less forgot how to implement or how most of the things I learned in the class work: linked lists, trees, different types of searches + sorts. So now as I am typing this blog, I am solving and looking at problems on leetcode.

A friend of mine currently works for Tesla and is looking for a new job. Most of the places he's been interviewing at for a full stack position have also asked him data structure and algorithm questions on top of questions involving joining two tables in SQL or how to mock a request for testing.

I think this is fair as it makes a developer conscious of the code they write and makes it easier to recognize patterns and respond accordingly.

For example, say I have an array of sorted numbers and I have to find if a given number exists:

I could loop through the array and check each element
Or
I could check if my given number is the middle element in the array. Depending on if it is bigger or smaller I can use the upper or lower half of the array and repeat the same steps, until the number is found or not found.

The second option sounds tedious, but depending on the size of the array, it may actually turn out to be faster than the initial option.

It also allows developers to think about the function they are writing performance-wise. Is it a O(n) solution? O(n^2) or worse O(n^3)? If it is the latter two, can I improve the run time of it? For personal projects this may not matter as much, but if you are working on software or systems that will be used by millions of people or contains a ton of data, these little things start to add up!

Thursday, April 23, 2020

Tuesday, April 14, 2020

OSD700 Release 1.0

We've finally hit 1.0 for Telescope. What a journey.

I finished up the issues I was still working on from 0.9 and worked on a few more:

PR #919: Feed Type Should Support Delete, Modify (Delete + Create)
Overall, I did not expect the whole "add feed" feature to be this big. We had 4-5 people working on this. 3 on the back-end, 1 on the front-end and our prof helping out on both ends. Happy to say we have it working. My PR was working as expected, but it wasn't doing great performance-wise due to multiple Promise.all() and awaits, with help from our prof, we were able to get rid of a lot of them. I learned if you want to trigger our prof, wrap a Promise.all() within another Promise.all().

PR #931: Add a Way to Receive Updates when New Posts are Available
PR is done, I'm beginning to regret mentioning I wanted to work with React.

PR #937: Finish Search Feature
This PR was possible because I learned a few things from reviewing a PR by @Silvyre. Seriously, if you want to be a better developer, look into doing code reviews, you'll widen your perspective.

PR #989: Teach tools/autodeployment/server.js about a release to master
Our staging box auto deploys itself whenever there is a change in the master branch. We wanted to change it up for production so it will only auto deploy itself whenever there is a tagged release. One of the things I have neglected during my years at Seneca is scripting or just using linux commands. I've started on the path of slowly redeeming myself with this PR.

PR #993: Remove "feed added successfully" after some period
Front end change, this uses the SnackBar component implemented in #931 to display a toast informing the user if their feed has been added

Wrapping Up
I started this journey in OSD600 where our first assignment was to create a notepad app (I did the bare minimum) and trying to enhance or debug other students' implementation of it on Github, I even remember I had issues writing the notepad app. Back then, I could not imagine contributing to building a project from the ground up (Telescope)... I could barely fix an issue a week to keep up with Hacktoberfest during that time.

Now that OSD700 is about to wrap-up, I've noticed how much my attitude and skill changed:

We have to implement a new tool for Telescope? No problem, time for some experimenting.
There's a new component that needs to be built? Front-end huh... but I'm game.
New issues dealing with a tool that has been implemented but I haven't used yet? Pick me!
Nginx issues? Oh god. where's @manekenpix to hold my hand through this?

Overall this course and project was extremely fun. We worked like a team you would find in a workplace, we had dev-ops, back-end devs, front-end devs. We got to explore and experiment with different tools we normally wouldn't get a chance to work with all at once: ElasticSearch, Redis, Kubernetes(lol), Gatsby, GraphQL, Nginx, Docker, Jest, SSO. It did not matter if we were not able to implement them in the project. Just the opportunity to experiment with it has improved my knowledge with it, which I am very grateful for.

I think why these open source classes were so fun was because all the things we were doing were open ended. These were real world issues, there is no answer guide for us to take a look at when we were stuck. Sometimes even our prof was reading API or tool docs to understand and help us out when we were stuck on our PRs.

Lastly, I do not think this would have been possible without our prof and lead @humphd, he's a monster. Deployment, back-end, front-end, authentication, system + architecture design, he is knowledgeable in it all. He can and will request changes on your PR which you spent hours working on. As of writing, we have ~468 closed PRs, of which he has probably reviewed close to that same number of them. Thank you for guiding us.

In no particular order it was a blast working with you guys. Thank you @manekenpix, @cindyledev, @Silvyre, @agarcia-caicedo, @grommers00, @miggs125, @lozinska, @yatsenko-julia

Other students at Seneca, if you have a chance to take OSD600/OSD700. Please do. This is probably the highlight of your program.

Monday, April 6, 2020

OSD700 Release 0.9

This post serves two purposes. I was told to blog and this is to test PR #931 for Telescope.

I'd like to give a big shout out to frequent collaborator and contributor @manekenpix for always helping me out. Be it reviews, collaborating on issues or help testing. He has made contributing to Telescope a lot easier than it should've been.

Here's the PR list I worked on for 0.9 and will continue to work on for 1.0

PR #937: Finish Search Feature
This is a PR in progress

The search feature we have right now works, but it isn't polished. We currently only support one search filter, authors. Searching takes a while, but there's no indication of whether the results are loading or not. I added a spinner to fix this. I have to also have to add an endpoint we have to search through Post content and return post results based on the data we get back.

PR #931: Add a Way to Receive Updates When New Posts are Available
This is a PR in progress

This is pretty close to finishing. I added a timer that will fire off a fetch request to get the number of posts in Telescope. If there is a change, a non-intrusive alert should pop up for 6 seconds letting the user know there are new posts available.

This PR also led @manekenpix and I on a 3 hour wild goose chase to track down why we weren't able to get one of the custom headers we had for Telescope. This resulted in the following PR #934, that's right, if you check the PR it took us 3 hours to add/change 4 lines. It was fun, but so so frustrating.

PR #919: Feed Type Should Support Delete, Modify (Delete + Create)
When I get frustrated with the two PRs above, I like to work on this because the backend doesn't lie. I don't have to deal with stuff rendering or not rendering on my screen or go reading up on new hooks. I pray if I ever have to do front-end development in the workplace I'll only have to use UseEffect and UseState hooks.

Anyways, this PR provides the functionality of removing a Feed and having it also remove associated posts in Redis + ElasticSearch. This PR was enjoyable because it taught me the advantages of writing lots of tests. The function works, but I just need  to finish writing a test for it and it should be ready to be reviewed by our gatekeeper @humphd.

Other PRs I worked on were refactoring the layout.js component we had previously using class components to become functional components. Refactoring this didn't take so long and at the end of it, I realized I got pretty good at using the useEffect and useState hooks of React.

As we're nearing 1.0 for Telescope and with so many things left to finish prior to shipping. I present the following as nobody in the Telescope channel has started panicking yet.



Contains Duplicate (Leetcode)

I wrote a post  roughly 2/3 years ago regarding data structures and algorithms. I thought I'd follow up with some questions I'd come...