Ngô Trung Hưng / venjob_nth
Commit 8fecd429, authored Jul 28, 2020 by Ngô Trung Hưng

fix -part 4

parent 716b0bd9
Pipeline #725 failed with stages in 0 seconds
Showing 2 changed files with 24 additions and 26 deletions

lib/src/interface_web.rb  +23 -25
lib/tasks/crawler.rake    +1 -1
lib/src/interface_web.rb @ 8fecd429
...
@@ -3,7 +3,7 @@
 require 'open-uri'
 # Crawler data
-class InterfaceWeb
+class Crawler
   COMPANY_SECURITY = 1
   NUMBER_LINK = 1
   SIZE_LI_INTERFACE_5 = 10
...
@@ -92,6 +92,28 @@ class InterfaceWeb
     end
   end
+  def make_data
+    puts 'Please wait for crawl jobs data! . . .'
+    link_crawl = link_job_and_companies
+    arr_link = []
+    link_crawl[1].each do |val|
+      break if @@stop_crawl == val
+      arr_link << val
+    end
+    arr_link.reverse!.each_with_index do |path, i|
+      page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(path))))
+      if page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].present?
+        crawl_data_jobs_interface_1(page)
+      elsif page.search('section .template-200').text.present?
+        crawl_data_jobs_interface_2(page)
+      elsif page.search('.DetailJobNew ul li').size == SIZE_LI_INTERFACE_5 &&
+            !page.search('.right-col ul li').text.include?('Độ tuổi')
+        crawl_data_jobs_interface_5(page)
+      end
+      puts "#{i} - #{path}"
+    end
+    puts 'Crawler data jobs success!'
+  end
   private
   def add_data(data)
...
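The first loop in `make_data` collects links until it reaches `@@stop_crawl`, then reverses the list before crawling. A minimal sketch of that step in isolation follows. It assumes the listing is ordered newest first and that `@@stop_crawl` holds the last link already crawled; neither is confirmed by the diff. The helper name `links_to_crawl` and the sample paths are hypothetical.

```ruby
# Hypothetical helper, not part of the commit: returns the links that come
# before `stop_link` in the listing, in reverse listing order.
def links_to_crawl(links, stop_link)
  # take_while stops at the first link equal to stop_link, which mirrors
  # `break if @@stop_crawl == val` followed by `arr_link << val`.
  links.take_while { |link| link != stop_link }.reverse
end

# Made-up sample data for illustration only.
listing = %w[/job/5 /job/4 /job/3 /job/2]
puts links_to_crawl(listing, '/job/3').inspect
```

If `stop_link` never appears in the listing, every link is returned, which matches the behavior of the original loop when no `break` fires.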
@@ -201,28 +223,4 @@ class InterfaceWeb
       CityJob.create!(job_id: id_job, city_id: id_cities)
     end
   end
-  public
-  def make_data
-    puts 'Please wait for crawl jobs data! . . .'
-    link_crawl = link_job_and_companies
-    arr_link = []
-    link_crawl[1].each do |val|
-      break if @@stop_crawl == val
-      arr_link << val
-    end
-    arr_link.reverse!.each_with_index do |path, i|
-      page = Nokogiri::HTML(URI.open(URI.parse(URI.escape(path))))
-      if page.search('.item-blue .detail-box:nth-child(1) ul li:nth-child(1) p')[0].present?
-        crawl_data_jobs_interface_1(page)
-      elsif page.search('section .template-200').text.present?
-        crawl_data_jobs_interface_2(page)
-      elsif page.search('.DetailJobNew ul li').size == SIZE_LI_INTERFACE_5 &&
-            !page.search('.right-col ul li').text.include?('Độ tuổi')
-        crawl_data_jobs_interface_5(page)
-      end
-      puts "#{i} - #{path}"
-    end
-    puts 'Crawler data jobs success!'
-  end
 end
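Taken together, the hunks for this file move `make_data` from below a trailing `public` keyword to above `private`. The sketch below suggests why this reads as a layout cleanup rather than a behavior change: in Ruby, a bare `public` or `private` applies to the methods defined after it, so both layouts leave `make_data` public. The class names `OldLayout` and `NewLayout` and the method bodies are hypothetical stand-ins, not code from the repository.

```ruby
# Layout before the commit (sketch): helpers are private, then `public`
# re-opens visibility for make_data at the end of the class.
class OldLayout
  private

  def add_data(data)
    "saved #{data}"
  end

  public

  def make_data
    add_data('job')
  end
end

# Layout after the commit (sketch): public methods first, a single `private`
# section last.
class NewLayout
  def make_data
    add_data('job')
  end

  private

  def add_data(data)
    "saved #{data}"
  end
end

[OldLayout, NewLayout].each do |klass|
  puts "#{klass}: make_data public? #{klass.public_method_defined?(:make_data)}, " \
       "add_data private? #{klass.private_method_defined?(:add_data)}"
end
```

Keeping one `private` section at the bottom is a common Ruby convention; it avoids toggling visibility back and forth within a class.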
lib/tasks/crawler.rake @ 8fecd429
...
@@ -10,7 +10,7 @@ namespace :crawler do
       company.address = 'Vui lòng xem trong mô tả công việc'
       company.short_description = 'Vui lòng xem trong mô tả công việc'
     end
-    cw = InterfaceWeb.new
+    cw = Crawler.new
     cw.craw_data_cities
     cw.craw_data_companies
     cw.make_data
...
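The task ends by calling `make_data`, which builds each page URL with `URI.escape`. That method was removed in Ruby 3.0, so the call only works on older Rubies. The sketch below shows one possible stand-in using `URI::RFC2396_Parser#escape` from the standard library. The helper name `escape_path` and the sample URL are hypothetical, and whether this matches the escaping the target site expects is an assumption that would need checking against real links.

```ruby
require 'uri'

# Hypothetical helper, not part of the commit: percent-encodes characters
# that are not allowed in a URI (spaces, non-ASCII bytes) while leaving
# reserved characters such as ':' and '/' untouched.
def escape_path(path)
  URI::RFC2396_Parser.new.escape(path)
end

# Made-up URL for illustration only.
url = escape_path('https://example.com/việc làm')
puts url
puts URI.parse(url).host
```

With a helper like this, the page fetch in `make_data` could pass `escape_path(path)` to `URI.parse` in place of `URI.escape(path)`.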