Wednesday, April 21, 2021

Documentation!

Recently I've been working with a fork of the Open edX platform (https://github.com/edx/edx-platform). My coworker and I were tasked with changing the built-in player, which accepts an mp4 link (YouTube or S3), into one that lets users upload a file to the platform, have it indexed into the user's S3 bucket, and then have the player updated with the uploaded video. Sounds fun, right? I assure you, it is not.


I have no idea how big the code base is. It is massive and, in contrast to the implementation docs, the development documentation is hugely outdated. After struggling with the code, my coworker and I were able to complete the client's request and implement the customized workflow.


Then came the next request, analytics: Which users of the platform completed which modules? How long did a user spend watching a video? How many users enrolled in a specific course this week? We proposed two solutions. The first was a hugely outdated but easy-to-implement analytics library that provides a high-level overview of the platform's users. Unfortunately, the library hadn't been updated to support the newest release of the Open edX platform we were using (Koa.2) and, worst of all, it didn't provide the granularity the client was asking for.


Great. Second solution: use Open edX's own data analytics solution. This alternative is infamous for clogging up the Open edX forum with requests for help debugging its installation and implementation (including requests from yours truly) and for forcing many people to use the first solution instead. Just check out how amazing this document is at teaching you to implement it, and enjoy the diagram.


I finally found some extremely old documentation written by the chief architect of edX. Check out the documentation for it. It was created in 2015 and has barely been updated since; one of the steps even tells you to use Hadoop 2.3 when the repo has already migrated to Hadoop 2.7. I can personally tell you that I had an issue with literally every step of the instructions and had to hack at each one. There is no FAQ, and the Slack channel and forums are so quiet that you'd be ecstatic to even get a response. If there is one thing OSD has taught me, it is to research and adapt, and maybe there will be a light at the end of it all. Here's a list of topics I had to research to almost finally get this working:

  • Hadoop
  • Hive
  • Ansible
  • Django
  • Tutor Open edX
  • Virtual env
  • AWS
  • MySQL
  • OAuth2
  • Luigi

All this because the documentation is not up to date. People who work on Telescope, be thankful the Slack channel is extremely active and the documentation is up to date.

Contains Duplicate (Leetcode)

I wrote a post roughly two or three years ago regarding data structures and algorithms. I thought I'd follow up with some questions I'd come...