WSyahirah21
Creator

Fetch and store the result of log.info() in a MySQL database using Talend Studio

My Spark job currently outputs its result through the log.info() function. I am now trying to fetch that output and store it in a MySQL database using Talend components.

 

Result from Spark:

INFO:spark2:Source rows = 2692687

INFO:jspark2:Destination rows = 2692687

 

My idea is to use this workflow to achieve that:

tSystem (submit Spark job) >> tFileInputDelimited (fetch the printed result from Spark) >> <some MySQL component to store the result in MySQL>

 

Is it possible to do it this way?

2 Replies
Anonymous
Not applicable

Hello,

 

You can read and parse the logs produced by Spark with tFileInputDelimited -> tFilterRow -> tDBOutput (MySQL).
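To illustrate the parsing step, here is a minimal plain-Java sketch that pulls the row counts out of log lines shaped like the two in the question. The class name `SparkLogRowCounts` and the regular expression are assumptions made for illustration; they are not part of Talend or Spark, and the pattern only covers the `INFO:<logger>:<Label> rows = <number>` shape shown above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a Talend class.
class SparkLogRowCounts {
    // Assumed line shape: "INFO:<logger>:<Label> rows = <number>".
    // Group 1 is the label (e.g. "Source"), group 2 is the count.
    private static final Pattern ROW_LINE =
            Pattern.compile("^INFO:[^:]+:(\\w+) rows = (\\d+)\\s*$");

    // Returns label -> count for every matching line; other lines are ignored.
    static Map<String, Long> parse(String logText) {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (String line : logText.split("\\R")) {
            Matcher m = ROW_LINE.matcher(line.trim());
            if (m.matches()) {
                counts.put(m.group(1), Long.parseLong(m.group(2)));
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String log = "INFO:spark2:Source rows = 2692687\n"
                   + "INFO:jspark2:Destination rows = 2692687\n";
        System.out.println(parse(log)); // {Source=2692687, Destination=2692687}
    }
}
```

The resulting label/count pairs map naturally onto two columns of a MySQL table, which is what the tDBOutput step would then write.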

 

Another option would be to use the outputLine of tSystem if the job you call produces console logs.

 

 

Log4j messages themselves could be forwarded to a database. One would have to use JDBCAppender for log4j.

In case of Log4J2 the config I used (for a POC!) was:

 

<JDBC name="dbAppender" tableName="log4j2.all_log" connectionSource="PoolingDriver">
  <DriverManager connectionString="jdbc:postgresql://localhost:5432/postgres"
                 driverClassName="org.postgresql.Driver" username="postgres" password=":)" />

  <Column name="moment" isEventTimestamp="true" />
  <Column name="origin" isUnicode="false" pattern="%replace{%msg}{(.+) - (.*)}{$1}" />
  <Column name="message" isUnicode="false" pattern="%replace{%msg}{(.+) - (.*)}{$2}" />
  <Column name="mdc" isUnicode="false" pattern="%X" />
  <Column name="level" isUnicode="false" pattern="%level" />

  <Filters>
    <MapFilter onMatch="NEUTRAL" onMismatch="ACCEPT">
      <KeyValuePair key="_pid" value="0"/>
    </MapFilter>
    <RegexFilter regex="^(connectionStatsLogs|talendStats_|talendMeter_|talendLogs_|tLogRow_).+" onMatch="DENY" onMismatch="ACCEPT"/>
    <RegexFilter regex=".+ - Parameters:.+" onMatch="DENY" onMismatch="ACCEPT"/>
  </Filters>
</JDBC>

This config is not for production use as it creates a lot of new connections.

 

WSyahirah21
Creator
Author

Hi,

I have since found an alternative way to fetch the string output:

tSystem_1 (print the output) -> tJava (get the output and bring to next component)

 

tSystem_1 output:

('source count:', '1000')

('destination count:', '100')

 

tJava code:

String output=((String)globalMap.get("tSystem_1_OUTPUT"));

System.out.println("Printing the error code 1 : "+StringUtils.substringBetween(output,"source count:", "destination count:"));

 

tJava output:

Printing the error code 1 : ', '1000')

('

 

However, substringBetween returns everything between the strings "source count:" and "destination count:", so the result includes the quotes, brackets, and line break. The expected result is only the value 1000.

 

How should I change this Java code to extract only the value?
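
One possible approach, sketched here under assumptions rather than as a confirmed answer, is to match the quoted number that follows the label with a regular expression instead of cutting between two labels. The class `TupleValueExtractor` and the method `valueFor` are hypothetical names for illustration; in a tJava component only the body of `valueFor` would be needed, applied to the string read from globalMap. The sketch assumes the output always has the form `('<label>', '<digits>')` as in the sample above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend.
class TupleValueExtractor {
    // Finds "'<label>', '<digits>'" in the output and returns the digits,
    // or null when the label is not present in that form.
    static String valueFor(String output, String label) {
        Pattern p = Pattern.compile(
                Pattern.quote("'" + label + "'") + "\\s*,\\s*'(\\d+)'");
        Matcher m = p.matcher(output);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample text copied from the tSystem_1 output shown above.
        String output = "('source count:', '1000')\n\n('destination count:', '100')";
        System.out.println(valueFor(output, "source count:"));      // 1000
        System.out.println(valueFor(output, "destination count:")); // 100
    }
}
```

If staying with Commons Lang is preferred, narrowing the delimiters should also work for this sample, for example StringUtils.substringBetween(output, "source count:', '", "'"), since the closing delimiter is then the first quote after the number.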