Wednesday, August 21, 2019

PSF GSoC students blogs: Blog #6

In the past week, my mentor and I tried to fix the Dockerfile that sets up Hadoop in an Ubuntu container from scratch. Since that was becoming tedious, we tried setting up a mini Hadoop cluster instead.

Apache provides a mini Hadoop cluster setup that gives a single-node cluster. I tried building it using a Maven Docker image. The documentation says very little about where Hadoop is actually downloaded or which ports it connects to by default. My mentor and I debugged the Dockerfile and tried to get it up and running, but there is still a problem with the ports, which I'm working on. We also figured out how to fetch files from HDFS, which can be either CSV or JSON files, and I have implemented those changes as well.
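To give a rough idea of what fetching CSV or JSON files from HDFS can look like, here is a minimal sketch. It is not the code from the project: it assumes the cluster exposes the WebHDFS REST API, and the function names (webhdfs_open_url, parse_records, read_hdfs_file) as well as the host, port, and paths are made up for illustration. Port 9870 is the default NameNode HTTP port in Hadoop 3; a mini cluster may well use a different one.

```python
import csv
import io
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def webhdfs_open_url(host, port, hdfs_path, user=None):
    """Build a WebHDFS URL that reads a file (op=OPEN).

    host, port, and user are assumptions about the cluster setup.
    """
    params = {"op": "OPEN"}
    if user:
        params["user.name"] = user
    return f"http://{host}:{port}/webhdfs/v1{quote(hdfs_path)}?{urlencode(params)}"


def parse_records(hdfs_path, payload):
    """Turn raw file bytes into a list of dicts, chosen by file extension."""
    text = payload.decode("utf-8")
    lowered = hdfs_path.lower()
    if lowered.endswith(".csv"):
        # The first CSV row is treated as the header.
        return list(csv.DictReader(io.StringIO(text)))
    if lowered.endswith(".json"):
        data = json.loads(text)
        # Wrap a single JSON object so callers always get a list.
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported file type: {hdfs_path}")


def read_hdfs_file(host, port, hdfs_path, user=None):
    """Fetch a file over WebHDFS and parse it (hypothetical helper)."""
    # The NameNode answers OPEN with a redirect to a DataNode;
    # urlopen follows that redirect automatically.
    with urlopen(webhdfs_open_url(host, port, hdfs_path, user)) as response:
        return parse_records(hdfs_path, response.read())
```

With a running cluster, a call such as read_hdfs_file("localhost", 9870, "/data/items.csv") would return the rows as dictionaries. Because the NameNode redirects reads to a DataNode, both the NameNode and DataNode HTTP ports need to be reachable from outside the container.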

Hopefully by next week I can finish this project.



from Planet Python
