I recently attended a two-day workshop on MongoDB under the mentorship of Mr. Vimal Daga Sir!

Key Takeaways:

  • Data models define how a DB’s structure is designed and how the data is connected, processed, and stored within a system
  • SQL (Structured Query Language) is the query language of relational DBs, which store data in tables with fixed fields. A NoSQL DB isn’t tabular and can store forms of data that don’t fit the usual relational DB tables
  • The main types of NoSQL databases are: document, key-value, wide-column, and graph
  • Advantages of NoSQL:

1. Flexible data models

2. Horizontal scaling

3. Fast queries

  • MS Excel is also a program used to store and analyze numerical data
  • Insert operations add new documents to a collection
  • Schema-less DBs are the ones that don’t enforce a fixed structure on the data
  • So documents in the same collection can have different sets of fields, with different types for each field
  • JavaScript Object Notation, or JSON for short, is a lightweight data-interchange format that is easy for humans to read and write
  • CRUD is the set of standard operations performed on data: Create, Read, Update, Delete
  • MongoDB provides various methods for performing these operations, e.g. insertOne(), find(), findOne(), updateOne(), deleteOne()
  • Compass is the GUI for MongoDB; using it we can explore data, run queries, optimize query performance, and so on
  • To use it, we have to provide the connection string (URI) of our MongoDB server to the Compass application
  • mongoimport is the command-line tool that lets us import data from JSON, CSV, or TSV files into a collection in a DB
  • MongoDB provides drivers for many programming languages, so we can integrate the database into our applications
  • PyMongo is the recommended driver for integrating MongoDB with Python applications
  • Indexes speed up retrieving or searching for documents in a collection: the server has to read far less data, which improves I/O performance
  • The primary key is a unique key for identifying a document within a given collection; in MongoDB this is the _id field
  • An index reduces the search time for a doc in a collection: it stores a small portion of the collection’s dataset (the values of the indexed fields) in a form that is easy to traverse
  • Sharding is the process of distributing the data across many nodes; its merits are that queries can run more efficiently because the load is spread out, and it can also help avoid a single point of failure (SPoF)
  • A Replica Set is used to improve redundancy and increase availability: replication provides a level of fault tolerance against data loss, since another member stays up when one server goes down
  • COLLSCAN is a query plan stage which means a collection scan is performed: the server scans the collection doc by doc to obtain the results
  • IXSCAN is the query plan stage in which index keys are scanned to get the result
  • A single index structure can hold references to multiple fields of a collection’s docs; this is called a Compound Index
  • The aggregation pipeline is a framework for data aggregation: documents enter a multi-stage pipeline that transforms them into aggregated results
  • The Mongo router (mongos) is a program that routes queries from client applications to the right shards in a sharded cluster
  • MongoDB Atlas is MongoDB’s multi-cloud database service, available on the major cloud providers
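
To make the COLLSCAN vs IXSCAN takeaway a bit more concrete, here is a minimal sketch in plain Python. It is only a toy model and not how MongoDB is actually implemented: the `students` data, the helper names `collscan`, `build_index`, and `ixscan`, and the use of a dict as the index are all my own assumptions for illustration (real MongoDB indexes are B-tree structures). It just shows why looking a value up in an index examines fewer docs than scanning the whole collection.

```python
# Toy model of COLLSCAN vs IXSCAN in plain Python.
# Assumption: a "collection" is just a list of dicts, and the "index"
# is a dict; real MongoDB indexes are B-trees managed by the server.

# Hypothetical sample collection, made up for this example.
students = [
    {"_id": 1, "name": "Asha", "city": "Jaipur", "marks": 82},
    {"_id": 2, "name": "Ravi", "city": "Delhi", "marks": 67},
    {"_id": 3, "name": "Meena", "city": "Jaipur", "marks": 91},
    {"_id": 4, "name": "Kabir", "city": "Pune", "marks": 74},
    {"_id": 5, "name": "Zoya", "city": "Delhi", "marks": 88},
]


def collscan(collection, field, value):
    """Check every document one by one, like a collection scan."""
    examined = 0
    matches = []
    for doc in collection:
        examined += 1
        if doc.get(field) == value:
            matches.append(doc)
    return matches, examined


def build_index(collection, field):
    """Map each value of `field` to the positions of the docs holding it."""
    index = {}
    for position, doc in enumerate(collection):
        if field in doc:
            index.setdefault(doc[field], []).append(position)
    return index


def ixscan(collection, index, value):
    """Look the value up in the index, then fetch only the matching docs."""
    positions = index.get(value, [])
    matches = [collection[p] for p in positions]
    return matches, len(matches)


city_index = build_index(students, "city")
scan_docs, scan_examined = collscan(students, "city", "Delhi")
index_docs, index_examined = ixscan(students, city_index, "Delhi")

print("COLLSCAN examined:", scan_examined)  # 5 docs examined
print("IXSCAN examined:", index_examined)   # 2 docs examined
```

Both approaches return the same two docs, but the scan touches all five while the index lookup touches only the two that match. The trade-off is that the index has to be built and kept up to date, and it takes extra space.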

I’d like to thank Mr. Vimal Daga Sir, Preeti Ma’am, and the LinuxWorld Informatics Pvt Ltd team for organizing such a great session.

Thank you for your time~