Migrating and Importing CSV files to MongoDB database in Spring Boot applications

In this blog, we will see how to integrate data migrations for a MongoDB database into a Spring Boot application. We will also see how to import a CSV file into the database.

To follow this tutorial, you need a basic understanding of Spring Boot and MongoDB. You also need Docker installed on your system: instead of installing the MongoDB server directly, we will run the database from a Docker Compose file.

Let's first generate a simple Spring Boot application.

Initial Project Generation

Go to Spring Initializr, then select Maven Project in the Project section, Java in the Language section, and 2.6.6 in the Spring Boot section.

In the Project Metadata section, enter:

Group: com.morshed

Artifact: mongo-migration

Name: mongo-migration

Description: A simple spring boot project on data migration into MongoDB


In the dependencies section, add these dependencies:

  • Spring Web
  • Spring Data MongoDB
  • Lombok
  • Spring Boot DevTools
Then click on the Generate button for downloading the project. Extract the zip file and open the project in your favorite IDE.




MongoDB Docker Compose File

Create a folder named docker under src/main, so that the directory structure becomes src/main/docker. Then create a file named mongodb.yml in that folder and add the following content to it.

version: '2'
services:
  morshed-mongodb:
    image: mongo:4.2.7
    ports:
      - '27022:27017'
    volumes:
      - ~/volumes/jhipster/morshed/mongodb/:/data/db/

Here, MongoDB listens on port 27017 inside the Docker container, and we expose it on port 27022 of the host machine.

To run the compose file, open a terminal in the directory that contains mongodb.yml and run the following command to start the database.
docker-compose -f mongodb.yml up
Now our database is up and running. Next we are going to configure the MongoDB database in the project.

Configure MongoDB database

Add the following properties to the application.properties file.

spring.data.mongodb.host=localhost
spring.data.mongodb.port=27022
spring.data.mongodb.database=morshed-blog
  
Now if we run the project, it will start on port 8080. Before we test anything, let's add a basic model.

Add a Document Model

Let's add a model named Division. 

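package com.morshed.mongomigration;

import lombok.*;
import org.springframework.data.annotation.Id;
import org.springframework.data.mongodb.core.mapping.Document;

@Document
@NoArgsConstructor
@AllArgsConstructor // required by @Builder once @NoArgsConstructor is present
@Getter
@Setter
@Builder
public class Division {
    @Id
    private String id;
    private String name;
    private String bnName;
    private String url;
}

Here, 

@Document marks the class as a MongoDB document. Its instances are stored in a collection that is named division by default.

@NoArgsConstructor is a Lombok annotation that creates a no-argument constructor.

@AllArgsConstructor is a Lombok annotation that creates a constructor taking every field. @Builder needs it here, because the class also declares a no-argument constructor.

@Getter is a Lombok annotation that creates the getters, so we don't need to write them ourselves.

@Setter is a Lombok annotation that creates the setters, so we don't need to write them ourselves.

@Builder is a Lombok annotation that creates a builder, which we will use later as Division.builder().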
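To make the annotations more concrete, here is a rough hand-written sketch of the kind of code Lombok generates for such a class. The class name PlainDivision is hypothetical and only two fields are kept for brevity; the code Lombok really generates differs in detail.

```java
// Hypothetical hand-written counterpart of a Lombok-annotated model, trimmed to two fields.
class PlainDivision {
    private String id;
    private String name;

    // No-argument constructor, as created by @NoArgsConstructor.
    PlainDivision() {
    }

    // Constructor taking every field; the builder below calls it.
    PlainDivision(String id, String name) {
        this.id = id;
        this.name = name;
    }

    // Getters, as created by @Getter.
    public String getId() { return id; }
    public String getName() { return name; }

    // Setters, as created by @Setter.
    public void setId(String id) { this.id = id; }
    public void setName(String name) { this.name = name; }

    // Static factory and fluent builder, as created by @Builder.
    public static Builder builder() { return new Builder(); }

    static class Builder {
        private String id;
        private String name;

        public Builder id(String id) { this.id = id; return this; }
        public Builder name(String name) { this.name = name; return this; }
        public PlainDivision build() { return new PlainDivision(id, name); }
    }
}
```

With this in place, PlainDivision.builder().id("1").name("Dhaka").build() reads the same way as the Lombok builder would.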

So our model or entity is ready. Let's add a repository for our Division model in the same package.


package com.morshed.mongomigration;

import org.springframework.data.mongodb.repository.MongoRepository;

public interface DivisionRepository extends MongoRepository<Division, String> {
}

Now we will add the migration feature to our project.

Add Mongock Dependency

Mongock is a Java-based migration tool. It lets us define our migrations in plain Java.

Add the following dependencies to the pom.xml file.

		<dependency>
			<groupId>com.github.cloudyrock.mongock</groupId>
			<artifactId>mongock-spring-v5</artifactId>
			<version>4.3.8</version>
		</dependency>
		<dependency>
			<groupId>org.mongodb</groupId>
			<artifactId>mongo-java-driver</artifactId>
			<version>2.14.0</version>
		</dependency>
		<dependency>
			<groupId>com.github.cloudyrock.mongock</groupId>
			<artifactId>mongodb-springdata-v3-driver</artifactId>
			<version>4.3.8</version>
		</dependency>

We are going to declare our migration classes in the default project package, i.e. com.morshed.mongomigration. We need to tell Mongock about this package in the application.properties file, so that when the Spring Boot application starts, Mongock scans the package for migration classes.

Add the following property.

mongock.change-logs-scan-package=com.morshed.mongomigration

Now let's create our first migration class. 


package com.morshed.mongomigration;

import com.github.cloudyrock.mongock.ChangeLog;
import com.github.cloudyrock.mongock.ChangeSet;
import com.github.cloudyrock.mongock.driver.mongodb.springdata.v3.decorator.impl.MongockTemplate;

@ChangeLog(order = "00001")
public class DbChangelog00001 {
    @ChangeSet(order = "001", id="initialDivisionData", author = "Monjur-E-Morshed")
    public void setInitialData(MongockTemplate mongockTemplate){
        // Build one sample Division and insert it into the database
        Division division = Division.builder()
                .name("Division")
                .bnName("BanglaDivis")
                .url("www.division.com")
                .build();
        mongockTemplate.insert(division);
    }
}

Here we have created a migration class named DbChangelog00001. A few things in it need explaining.

The @ChangeLog(order = "00001") annotation is used at the class level. It marks the class as a changelog, and order determines the order in which the changelogs are executed. It's a good practice to include the order number in the class name.

The @ChangeSet annotation is used at the method level. Here it has three parameters. order is the order in which the methods inside the changelog are executed. id is an identifier that you choose yourself; there are no specific rules for it, except that it must be unique. author is the person who wrote the change set.
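Since order is a string, zero-padded values such as "00001" keep their intended sequence even when there are ten or more changelogs. The sketch below uses plain Java string sorting to illustrate the point. It assumes that order values are compared as strings, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo: how string ordering treats padded and unpadded order values.
class OrderDemo {
    // Returns a new list sorted by natural (lexicographic) String order.
    static List<String> sorted(List<String> orders) {
        List<String> copy = new ArrayList<>(orders);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded: "10" sorts before "2" because '1' < '2'.
        System.out.println(sorted(Arrays.asList("2", "10", "1")));             // [1, 10, 2]
        // Zero-padded: the string order matches the numeric sequence.
        System.out.println(sorted(Arrays.asList("00002", "00010", "00001"))); // [00001, 00002, 00010]
    }
}
```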

Each change set method here takes a parameter of type MongockTemplate, which comes from the mongodb-springdata-v3-driver dependency. The parameter is used to write data to the database: inside the method, mongockTemplate.insert(division) stores the division.

Before running the application, let's write a simple resource that returns all Division data from a REST endpoint.


package com.morshed.mongomigration;

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.List;

// The value of @RestController is a bean name, so the path goes on @RequestMapping
@RestController
@RequestMapping("/")
public class DivisionResource {
    private final DivisionRepository divisionRepository;

    public DivisionResource(DivisionRepository divisionRepository) {
        this.divisionRepository = divisionRepository;
    }

    // GET / returns every Division stored in the database
    @GetMapping
    public List<Division> getAllDivisions(){
        return divisionRepository.findAll();
    }
}


The resource simply returns the list of divisions when we go to http://localhost:8080/. Mongock is not configured yet, though: we need to add @EnableMongock to our main class, MongoMigrationApplication. Let's add the annotation.


package com.morshed.mongomigration;

import com.github.cloudyrock.spring.v5.EnableMongock;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@EnableMongock
public class MongoMigrationApplication {

	public static void main(String[] args) {
		SpringApplication.run(MongoMigrationApplication.class, args);
	}

}

Now let's run the application. If we go to http://localhost:8080/, we will see a JSON array containing the division that was inserted by the changelog.

So, our Mongock migration is working. Now let's import a CSV file.

Importing CSV files

Create a CSV file named divisions.csv under src/main/resources/initial-data and add the following data.

1,Chattagram,চট্টগ্রাম,www.chittagongdiv.gov.bd
2,Rajshahi,রাজশাহী,www.rajshahidiv.gov.bd
3,Khulna,খুলনা,www.khulnadiv.gov.bd
4,Barisal,বরিশাল,www.barisaldiv.gov.bd
5,Sylhet,সিলেট,www.sylhetdiv.gov.bd
6,Dhaka,ঢাকা,www.dhakadiv.gov.bd
7,Rangpur,রংপুর,www.rangpurdiv.gov.bd
8,Mymensingh,ময়মনসিংহ,www.mymensinghdiv.gov.bd


Now add another migration class named DbChangelog00002.java under our default Java package.


package com.morshed.mongomigration;

import com.github.cloudyrock.mongock.ChangeLog;
import com.github.cloudyrock.mongock.ChangeSet;
import com.github.cloudyrock.mongock.driver.mongodb.springdata.v3.decorator.impl.MongockTemplate;
import org.springframework.core.io.ClassPathResource;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

@ChangeLog(order = "00002")
public class DbChangelog00002 {
    @ChangeSet(order = "001", id="divisionData", author = "Monjur-E-Morshed")
    public void importDivisionCSVData(MongockTemplate mongockTemplate) throws IOException {
        ClassPathResource divisionFile = new ClassPathResource("initial-data/divisions.csv");
        // Read the classpath stream as UTF-8 so the Bengali names survive and the
        // import also works when the application is packaged as a jar.
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(divisionFile.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = br.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // skip blank lines
                }
                // Columns: id, name, Bengali name, URL
                String[] fields = line.split(",");
                Division division = new Division();
                division.setId(fields[0]);
                division.setName(fields[1]);
                division.setBnName(fields[2]);
                division.setUrl(fields[3]);
                mongockTemplate.insert(division);
            }
        }
    }
}
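The import loop trusts that every line splits into exactly four values. A slightly more defensive variant could validate each line before building a Division. The sketch below is plain Java without any Spring or Mongock types. CsvLineParser and parseLine are hypothetical names, and the sketch still assumes that fields never contain commas or quotes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original project.
class CsvLineParser {
    static final int EXPECTED_FIELDS = 4; // id, name, bnName, url

    // Splits one CSV line into trimmed fields and rejects lines with the wrong field count.
    static List<String> parseLine(String line) {
        // A limit of -1 keeps trailing empty fields, so "9,Test,," still yields four values.
        String[] parts = line.split(",", -1);
        if (parts.length != EXPECTED_FIELDS) {
            throw new IllegalArgumentException(
                    "Expected " + EXPECTED_FIELDS + " fields but got " + parts.length + ": " + line);
        }
        List<String> fields = new ArrayList<>();
        for (String part : parts) {
            fields.add(part.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Surrounding spaces are trimmed from each field.
        System.out.println(parseLine("6, Dhaka ,bn-name,www.dhakadiv.gov.bd"));
    }
}
```

Failing fast on a malformed line is a design choice: a migration that stops with a clear message is usually easier to repair than one that silently stores half-filled documents.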

If we restart the application, the data in the CSV file will be imported by the DbChangelog00002 changelog. If we visit localhost:8080/ again, the response will now also include the eight divisions from the CSV file.

So our CSV data import is now complete. Other file types (XML, JSON) can be imported in the same way.
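Only the new change set runs on this restart, because Mongock records which change sets have already been applied and skips them afterwards. The toy sketch below mimics that bookkeeping with an in-memory set. It is a simplified illustration rather than Mongock's actual implementation, and every name in it is hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Toy model of run-once migration bookkeeping; all names are hypothetical.
class RunOnceRegistry {
    // Stands in for the persistent record of executed change sets.
    private final Set<String> executedIds = new LinkedHashSet<>();

    // Runs the change only if its id has not been recorded yet; returns true when it ran.
    boolean runIfNew(String changeSetId, Runnable change) {
        if (executedIds.contains(changeSetId)) {
            return false; // already applied on an earlier start
        }
        change.run();
        executedIds.add(changeSetId);
        return true;
    }

    public static void main(String[] args) {
        RunOnceRegistry registry = new RunOnceRegistry();
        // First start: only the initial change set exists.
        System.out.println(registry.runIfNew("initialDivisionData", () -> {})); // true
        // Second start: the old change set is skipped and the new one runs.
        System.out.println(registry.runIfNew("initialDivisionData", () -> {})); // false
        System.out.println(registry.runIfNew("divisionData", () -> {}));        // true
    }
}
```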

You may check the git repository: monjurmorshed793/mongo-migration-spring (github.com)

Thanks for reading the blog. Any comments or suggestions are welcome.




