Not applicable

Website data extract

Hello,

I am trying to compile a local copy of the PDGA player statistics. The data is available on their website and I can load it into QlikView, but only one page at a time; there are currently 1500 pages and there will be more in the future. Is there a way to have the script pull all 1500 pages rather than just the first one? Here is the page I am pulling the data from: PDGA Player Statistics | Professional Disc Golf Association

LOAD Name,
     [PDGA #],
     Rating,
     Year,
     Class,
     Gender,
     Bracket,
     Country,
     [State/Prov],
     Events,
     Points,
     Cash
FROM
[http://www.pdga.com/players/stats?order=Rating&sort=Desc]
(html, codepage is 1252, embedded labels, table is @1);

1 Solution

Accepted Solutions
robert_mika
Master III

Create a variable that changes the part of the link that identifies the page:

...http://www.pdga.com/players/stats?Year=2016&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All...3&order=Rating&sort=Desc

and then loop through all the pages (see "Loops in the Script").

It is not an easy task, but it is possible.

4 Replies
el_aprendiz111
Specialist

Hi,

[Image attachment: web.png]

effinty2112
Master

Hi Jason,

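Try:

// Loop over the first eleven result pages (i = 0 to 10).
for i = 0 to 10

// The first page takes no page number; later pages append it to the URL.
// (The URLs are truncated in this post.)
if i = 0 then
Let vWebPage = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...';
ELSE
Let vWebPage = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...' & $(i);
End if;

// The field list is identical on every pass, so each page is appended to Data.
Data:
LOAD Name,
     [PDGA #],
     Rating,
     Year,
     Class,
     Gender,
     Bracket,
     Country,
     [State/Prov],
     Events,
     Points,
     Cash
FROM
[$(vWebPage)]
(html, codepage is 1252, embedded labels, table is @1);

Next i;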

This loads the first eleven pages of stats. Since there are 3918 pages in total (i goes from 0 to 3917), I would load them in batches of a few hundred pages and store each batch to a QVD until you have them all.

Cheers

Andrew
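The batching idea above could look roughly like the sketch below. It is only an illustration, not Andrew's actual script: the variable names vBaseUrl, vFirstPage and vLastPage, the table name Batch and the QVD file name are all made up for the example, and the URL is copied in its truncated form from the post, so it has to be completed before this will run. It also assumes the site accepts page number 0 in the URL.

```
// Placeholder: the stats URL up to the page number. It is truncated ("...")
// in the post above and must be completed by hand.
SET vBaseUrl = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...';

// Hypothetical batch bounds; change them on each run (0-299, 300-599, ...).
LET vFirstPage = 0;
LET vLastPage = 299;

FOR i = $(vFirstPage) TO $(vLastPage)

    LET vWebPage = '$(vBaseUrl)' & $(i);

    // Identical field lists auto-concatenate, so every page lands in Batch.
    Batch:
    LOAD Name,
         [PDGA #],
         Rating,
         Year,
         Class,
         Gender,
         Bracket,
         Country,
         [State/Prov],
         Events,
         Points,
         Cash
    FROM
    [$(vWebPage)]
    (html, codepage is 1252, embedded labels, table is @1);

NEXT i

// One QVD per batch, named after the page range it covers.
STORE Batch INTO [PlayerStats_$(vFirstPage)_$(vLastPage).qvd] (qvd);
DROP TABLE Batch;
```

Once every batch has been stored, the pieces can be read back in one go with a wildcard load such as LOAD * FROM [PlayerStats_*.qvd] (qvd); since all the files share the same fields.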

Not applicable
Author

I was able to do it like this.

SET ThousandSep=',';
SET DecimalSep='.';
SET MoneyThousandSep=',';
SET MoneyDecimalSep='.';
SET MoneyFormat='$#,##0.00;($#,##0.00)';
SET TimeFormat='h:mm:ss TT';
SET DateFormat='M/D/YYYY';
SET TimestampFormat='M/D/YYYY h:mm:ss[.fff] TT';
SET MonthNames='Jan;Feb;Mar;Apr;May;Jun;Jul;Aug;Sep;Oct;Nov;Dec';
SET DayNames='Mon;Tue;Wed;Thu;Fri;Sat;Sun';

SET a=1;

// Load the first page on its own to create the table.
[Player Stats]:
LOAD Name,
     [PDGA #] as PDGANum,
     Rating,
     Year,
     Class,
     Gender,
     Bracket,
     Country,
     [State/Prov],
     Events,
     Points,
     Cash
FROM
[http://www.pdga.com/players/stats]
(html, codepage is 1252, embedded labels, table is @1);

// Append the remaining pages to the same table.
// (The URL below is truncated in the original post.)
For a=1 to 1499

Concatenate ([Player Stats])
LOAD Name,
     [PDGA #] as PDGANum,
     Rating,
     Year,
     Class,
     Gender,
     Bracket,
     Country,
     [State/Prov],
     Events,
     Points,
     Cash
FROM
[http://www.pdga.com/players/stats?Year=2016&Class=All&Gender=All&Bracket=All&Country=All&StateProv=A...)]
(html, codepage is 1252, embedded labels, table is @1);

Next a

Store [Player Stats] into AllPlayerDataPull.qvd (qvd);
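
Since the page count will keep growing, one possible variation is to stop when a page adds no new rows instead of hard-coding the last page number. This is an untested sketch under several assumptions: vBaseUrl, vMaxPages, vPage, vRowsBefore and vRowsAfter are names invented for the example, the URL is the truncated one from the thread and must be completed, page number 0 is assumed to be valid, and a page past the end is assumed either to fail to load or to return no rows. The cap in vMaxPages is there in case the site keeps returning data for out-of-range pages.

```
// Placeholder: the stats URL up to the page number (truncated in the thread).
SET vBaseUrl = 'http://www.pdga.com/players/stats?Year=All&Class=P&Gender=All&Bracket=All&Country=All&StateProv=All&...';

// Hypothetical safety cap on the number of pages to try.
SET vMaxPages = 5000;

// With ErrorMode 0, a page that fails to load is skipped instead of halting the script.
SET ErrorMode = 0;

LET vPage = 0;

DO
    // Rows loaded so far (0 while the table does not exist yet).
    LET vRowsBefore = Alt(NoOfRows('Player Stats'), 0);

    [Player Stats]:
    LOAD Name,
         [PDGA #] as PDGANum,
         Rating,
         Year,
         Class,
         Gender,
         Bracket,
         Country,
         [State/Prov],
         Events,
         Points,
         Cash
    FROM
    [$(vBaseUrl)$(vPage)]
    (html, codepage is 1252, embedded labels, table is @1);

    LET vRowsAfter = Alt(NoOfRows('Player Stats'), 0);
    LET vPage = vPage + 1;

// Keep going only while the last page added rows and the cap is not reached.
LOOP WHILE vRowsAfter > vRowsBefore AND vPage < vMaxPages

// Restore the default error handling before storing the result.
SET ErrorMode = 1;

STORE [Player Stats] INTO AllPlayerDataPull.qvd (qvd);
```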