Search

Total Pageviews

Categories

Linux (8) Windows (5) Container (4) Development (4) Database (3) Docker (3) Mac OS (3) QuickStart (3) Ubuntu (3) AWS (1) Fedora (1) FreeBSD (1) Git (1) Java (1) MyBatis (1) Oracle Linux (1) Spring Boot (1) Unix (1) VMware (1) Virtualization (1) kdevtmpfsi (1) kinsing (1)

Featured Post

Your container's CPU usage is more than 100%, "kdevtmpfsi" the cryptomining malware is running | So how to resolve

How to resolve when "kdevtmpfsi" the crypto-mining malware is running and taking all CPU load of your server (container). One d...

Showing posts with label Development. Show all posts
Showing posts with label Development. Show all posts

Sunday, March 27, 2022

How to update and insert 100k CSV lines into the 10M records MS SQL Server database table by a cron job in a few seconds?

 How to update and insert 100k CSV lines into the 10M records MS SQL Server database table by a cron job in a few seconds?

Context

In my application, there is a CUSTOMER table containing more than 10 million records. And daily, another application will export CRM data to a CSV file and upload it to an S3 bucket. This CSV file usually contains about 30 thousand lines. In there, more than 1 thousand lines are new customers and the remaining are existing customers in the CUSTOMER table need to be updated their information. 1 cron job reads the CSV file and updates/inserts it to the CUSTOMER table. The issue here is if we update each of nearly 30k existing records into the CUSTOMER table with 10M records existing in there because it will take a few hours to complete. After googling for some solutions, I got a suggestion with the concept INSERT ON DUPLICATE KEY UPDATE. But this solution is usually only used in MySQL or PostgreSQL and not supported by MS SQL Server, instead of this, MS SQL Server has another solution to resolve this issue, which is the MERGE concept. Then, I have applied to my application then I take notes in a short demo here to reuse later and share someone finding solutions to resolve similar issues.

Idea

  1. Just insert all CSV lines to a temporary table `CUSTOMER_TEMP` by using the MyBatis batch insert.
  2. Call a SQL script by MyBatis ScriptRunner to merge data from the `CUSTOMER_TEMP` to the `CUSTOMER` table by the internal database engine.
In the real project, I'm using spring-boot-start-batch, but for demo purposes, I only write a short Java class to read the CSV file customers_100k_lines.csv from the current project folder instead of the S3 bucket, then batch insert to CUSTOMER_TEMP and call ScriptRunner to MERGE the data to the CUSTOMER table. The CSV file contains 50k lines are existing records in the CUSTOMER table should be updated and 500k are new ones, should be inserted. I'm also prepared 2 SQL scripts to create the CUSTOMER_TEMP and CUSTOMER table with 10M records existing. If you want to run this source code, please correct the database information in the application.yml file, then you can check the total time spent, and the time spent for batch insert and data merging in log files. Hope this is helpful for someone.




Logs:
Total records in CUSTOMER table before: 10000000
Total CSV lines: 100000. Total inserted to CUSTOMER_TEMP table: 100000 by 10 batchs. Time spent: 15589 (ms).
Time spent to merge data from CUSTOMER_TEMP to CUSTOMER table: 1325 (ms).
Total records in CUSTOMER table after: 10100000
References:
https://ichihedge.wordpress.com/2020/01/12/mybatis-save-or-update/
https://www.sqlshack.com/understanding-the-sql-merge-statement/
https://docs.microsoft.com/en-us/sql/t-sql/statements/merge-transact-sql?view=sql-server-ver15

Sunday, November 24, 2019

QuickStart: The fastest and easiest way to run an Local SMTP Server for testing or development.

Run Local SMTP Server on your development environment.

Your code needs to run the test cases with send email from SMTP server but you have no SMTP Mail Server on your local and you get the error below:


> :bootRun
org.springframework.mail.MailSendException: Mail server connection failed; nested exception is com.sun.mail.util.MailConnectException: Couldn't connect to host, port: localhost, 25; timeout -1;
  nested exception is:
        java.net.ConnectException: Connection refused (Connection refused). Failed messages: com.sun.mail.util.MailConnectException: Couldn't connect to host, port: localhost, 25; timeout -1;
  nested exception is:
        java.net.ConnectException: Connection refused (Connection refused)
        at org.springframework.mail.javamail.JavaMailSenderImpl.doSend(JavaMailSenderImpl.java:448)
        at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:361)
        at org.springframework.mail.javamail.JavaMailSenderImpl.send(JavaMailSenderImpl.java:356)


Option 1: Via Docker

- Install Docker If you not yet.
- Register a Docker ID https://hub.docker.com/ if you don't have.
- Login Docker Hub.
- Skip this step if Docker already installed on your Mac/PC and you already logged in Docker Hub.







Pull and run djfarrelly/maildev Image from Docker Hub.

- Search for "djfarrelly/maildev" in Docker Hub and Checkout the document for more details.



- Run the command below to starting a Local SMTP Mail Server Instance as a Docker container named "MailDev" and mapping port 25(SMTP) and 80(Web) from the container to port 25 and 1080 on your Docker host (your PC/Mac). This will take you a few minutes to download the image and Starting the container.

docker run -d -it --name MailDev -p 1080:80 -p 25:25  djfarrelly/maildev




- Now re-run your test cases again:

2019-11-23 23:26:43.499 DEBUG 7573 --- [sbrgprjs-task-1] me.koacervate.aop.logging.LoggingAspect  : Enter: me.koacervate.service.MailService.sendCreationEmail() with argument[s] = [User{login='koacervate', firstName='Khue', lastName='Duke', email='koacervate@googlemail.com', imageUrl='null', activated='true', langKey='en', activationKey='null'}]
2019-11-23 23:26:43.500 DEBUG 7573 --- [sbrgprjs-task-1] me.koacervate.service.MailService        : Sending creation email to 'koacervate@googlemail.com'
2019-11-23 23:26:43.526 DEBUG 7573 --- [sbrgprjs-task-1] me.koacervate.service.MailService        : Send email[multipart 'false' and html 'true'] to 'koacervate@googlemail.com' with subject 'jsbrgprjs account activation' and content=<!DOCTYPE html>
<html lang="en">
    <head>
        <title>jsbrgprjs account activation</title>
        <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
        <link rel="shortcut icon" href="http://127.0.0.1:8080/favicon.ico" />
    </head>
    <body>
        <p>Dear koacervate</p>
        <p>Your jsbrgprjs account has been created, please click on the URL below to access it:</p>
        <p>
            <a href="http://127.0.0.1:8080/account/reset/finish?key=RCTVtLkilrCdawcCDYKf">http://127.0.0.1:8080/account/reset/finish?key=RCTVtLkilrCdawcCDYKf</a>
        </p>
        <p>
            <span>Regards,</span>
            <br/>
            <em>jsbrgprjs Team.</em>
        </p>
    </body>
</html>

2019-11-23 23:26:44.073 DEBUG 7573 --- [sbrgprjs-task-1] me.koacervate.service.MailService        : Sent email to User 'koacervate@googlemail.com'

2019-11-23 23:26:44.073 DEBUG 7573 --- [sbrgprjs-task-1] me.koacervate.aop.logging.LoggingAspect  : Exit: me.koacervate.service.MailService.sendCreationEmail() with result = null

- You can see the sent email in Web UI http://localhost:1080/#/email/JevGMzCd


Option 2: Via install maildev node package as below:

$ npm install -g maildev
$ maildev
with options as below:

Usage

maildev [options]

  -h, --help                      output usage information
  -V, --version                   output the version number
  -s, --smtp <port>               SMTP port to catch emails [1025]
  -w, --web <port>                Port to run the Web GUI [1080]
  --ip <ip address>               IP Address to bind SMTP service to
  --outgoing-host <host>          SMTP host for outgoing emails
  --outgoing-port <port>          SMTP port for outgoing emails
  --outgoing-user <user>          SMTP user for outgoing emails
  --outgoing-pass <password>      SMTP password for outgoing emails
  --outgoing-secure               Use SMTP SSL for outgoing emails
  --auto-relay [email]            Use auto-relay mode. Optional relay email address
  --auto-relay-rules <file>       Filter rules for auto relay mode
  --incoming-user <user>          SMTP user for incoming emails
  --incoming-pass <pass>          SMTP password for incoming emails
  --web-ip <ip address>           IP Address to bind HTTP service to, defaults to --ip
  --web-user <user>               HTTP user for GUI
  --web-pass <password>           HTTP password for GUI
  --base-pathname <path>          base path for URLs
  --disable-web                   Disable the use of the web interface. Useful for unit testing
  --hide-extensions <extensions>  Comma separated list of SMTP extensions to NOT advertise
                                  (STARTTLS, SMTPUTF8, PIPELINING, 8BITMIME)
  -o, --open                      Open the Web GUI after startup
  -v, --verbose
  --silent
Please refer to documents at https://hub.docker.com/r/djfarrelly/maildev/ for more details.

Option 3: Integrate via API or SMTP with https://sendgrid.com/

You can register a free account for testing and choose to integrate via API or SMTP then follow the instructions in the website you will have all necessary information to send emails with your code.

  mail:
    host: smtp.sendgrid.net
    port: 465
    username: apikey
    password: SG.t2L6qpv3TjmvgcasoksVbg.DRsZrTwq3LmApU1uaQ6mjv-koafl0NaSzogCW6X9***
    properties:
      mail:
        smtp:
          auth: true
          starttls:
            enable: true


References:
- https://docs.docker.com
https://hub.docker.com/r/djfarrelly/maildev/
https://github.com/maildev/maildev

Saturday, January 26, 2019

QuickStart: The fastest and easiest way to run an Instance of Oracle Database on Mac OS

Run Oracle Database on your Mac.


Need an instance of Oracle Database Server running on your Mac for your testing or development purpose?
-> Via Docker and Oracle Database Enterprise Edition Image with few steps in minutes.

Install Docker CE for Mac and log in to Docker Hub.

- Register a Docker ID if you don't have.
- Skip this step if Docker already installed on your Mac and you already logged in Docker Hub.







Pull and run Oracle Database Enterprise Edition Image from Docker Hub.

- Search for "Oracle Database Enterprise Edition" in Docker Hub and Checkout the document for more details.




- Run the command below to starting an Oracle Database Server Instance as a Docker container named "Oracle-DB" and mapping port 1521 from the container to port 1521 on your Docker host (your Mac). This will take you a few minutes to download the image and Starting the container.

docker run -d -it --name Oracle-DB -p 1521:1521/tcp  store/oracle/database-enterprise:12.2.0.1



- After the download is done and the container started, you can check the image already pulled and the container running as below:



- Now you can connect to the Oracle Instance via Oracle SQL Developer Tool with username: "sys as sysdba", password: Oradoc_db1 as below:



- Ok, So now you already have an instance of Oracle Database server in your Mac for testing or development. When you don't want the instance running and saving your Mac performance, you can stop the container or want to start the instance again as below:


- If your project is done and you don't want to use the instance any more, you should stop the container and delete the image as below:


- And remove unused volume:
docker volume prune

Error Notes: When you connect from Oracle SQL Developer Tool to the Oracle Database Server Instance, if you get a message "Status: Failure -Test failed: IO Error: Got minus one from a read call" or "Status: Failure -Test failed: Listener refused the connection with the following error:
ORA-12505, TNS: listener does not currently know of SID given in connect descriptor" -> So please wait in moments, the instance is preparing in the container, you can connect success very soon after that, don't worry :). Good luck and fun.

References:
- https://docs.docker.com
- https://hub.docker.com/_/oracle-database-enterprise-edition?tab=resources


Sunday, April 23, 2017

Get error: fatal error C1034: "stdio.h: no include path set" when you compile a C/C++ source code file.



1
2
3
4
5
6
7
F:\
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\bin\HostX64\x64\cl.exe" chrAprMax_byAdd.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.10.25019 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

chrAprMax_byAdd.c
chrAprMax_byAdd.c(1): fatal error C1034: stdio.h: no include path set

- Please make sure you run "Developer Command Prompt for VS 20xx" or set Path variable includes "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.10.25017\bin\HostX64\x64\" before that and run cl.exe in the window opened by "Developer Command Prompt for VS 20xx".


or



-> So now you can compile C/C++ source file without error: "fatal error C1034: stdio.h: no include path set" :)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
F:\>cl chrAprMax.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.10.25019 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

chrAprMax.c
Microsoft (R) Incremental Linker Version 14.10.25019.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:chrAprMax.exe
chrAprMax.obj

F:\>chrAprMax
Enter a string: just for fun
The character that appears the most frequently is "u" with 2 matched.

F:\>cl chrAprMax.cpp -o chrAprMaxpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.10.25019 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

cl : Command line warning D9035 : option 'o' has been deprecated and will be rem
oved in a future release
chrAprMax.cpp
Microsoft (R) Incremental Linker Version 14.10.25019.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:chrAprMax.exe
/out:chrAprMaxpp.exe
chrAprMax.obj

F:\>chrAprMaxpp
Enter a string: just for fun
The character that appears the most frequently is "u" with 2 matched.