Commit eeeefc0a in venjob, authored Oct 10, 2017 by Xuan Trung Le
fix bugs
parent a3c73032
Showing 5 changed files with 47 additions and 39 deletions
app/data/crawler.rb    +7  -7
app/models/city.rb     +0  -9
app/models/job.rb      +8  -15
db/seeds.rb            +31 -7
lib/tasks/data.rake    +1  -1
app/data/crawler.rb
@@ -8,10 +8,9 @@ class Crawler
   LIST_URL = "#{BASE_CAREERBUILDER_URL}/viec-lam"
   def self.crawl_job_infomation(job_links)
-    links = job_links
     job_details = []
-    links.each do |link|
+    job_links.each do |link|
       puts "Fetching #{link}..."
       params = {}
       link = URI.escape(link)
@@ -26,7 +25,8 @@ class Crawler
       job_details << params
       end
     end
-    return job_details
+    job_details
   end
   def self.use_template_default(doc, link)
@@ -72,7 +72,7 @@ class Crawler
     # original_link
     params[:original_link] = link
-    return params
+    params
   end
   def self.crawl_company_infomation(doc)
@@ -85,12 +85,12 @@ class Crawler
     end
     params[:name] ||= 'Bảo mật'
-    return params
+    params
   end
-  def self.get_job_link
+  def self.get_job_links
     url = "#{LIST_URL}/tat-ca-viec-lam-trang-#{1}-vi.html"
     doc = Nokogiri::HTML(open(url))
-    return doc.css('.gird_standard .brief .jobtitle .job a').map { |a| a['href'] }.compact.uniq
+    doc.css('.gird_standard .brief .jobtitle .job a').map { |a| a['href'] }.compact.uniq
   end
 end
app/models/city.rb
@@ -2,13 +2,4 @@ class City < ApplicationRecord
   belongs_to :country, optional: true
   has_and_belongs_to_many :companies
   has_and_belongs_to_many :jobs
-  def self.list
-    ['AN GIANG', 'BÀ RỊA - VŨNG TÀU', 'BẮC GIANG', 'BẮC KẠN', 'BẠC LIÊU', 'BẮC NINH', 'BẾN TRE', 'BÌNH ĐỊNH', 'BÌNH DƯƠNG', 'BÌNH PHƯỚC',
-     'BÌNH THUẬN', 'CÀ MAU', 'CAO BẰNG', 'DAK LAK', 'DAK NÔNG', 'ĐIỆN BIÊN', 'ĐỒNG NAI', 'ĐỒNG THÁP', 'GIA LAI', 'HÀ GIANG',
-     'HÀ NAM', 'HÀ TĨNH', 'HẢI DƯƠNG', 'HẬU GIANG', 'HÒA BÌNH', 'HƯNG YÊN', 'KHÁNH HÒA', 'KIÊN GIANG', 'KON TUM', 'LAI CHÂU',
-     'LÂM ĐỒNG', 'LẠNG SƠN', 'LÀO CAI', 'LONG AN', 'NAM ĐỊNH', 'NGHỆ AN', 'NINH BÌNH', 'NINH THUẬN', 'PHÚ THỌ', 'QUẢNG BÌNH',
-     'QUẢNG NAM', 'QUẢNG NGÃI', 'QUẢNG NINH', 'QUẢNG TRỊ', 'SÓC TRĂNG', 'SƠN LA', 'TÂY NINH', 'THÁI BÌNH', 'THÁI NGUYÊN', 'THANH HÓA',
-     'THỪA THIÊN- HUẾ', 'TIỀN GIANG', 'TRÀ VINH', 'TUYÊN QUANG', 'VĨNH LONG', 'VĨNH PHÚC', 'YÊN BÁI', 'PHÚ YÊN', 'CẦN THƠ', 'ĐÀ NẴNG',
-     'HẢI PHÒNG', 'HÀ NỘI', 'HỒ CHÍ MINH', 'KV BẮC TRUNG BỘ', 'KV ĐÔNG NAM BỘ', 'KV NAM TRUNG BỘ', 'KV TÂY NGUYÊN']
-  end
 end
app/models/job.rb
@@ -8,13 +8,11 @@ class Job < ApplicationRecord
   has_and_belongs_to_many :cities
   def self.create_new_jobs(arr_jobs)
-    viet_nam = Country.find_or_create_by(name: 'Viet Nam')
-    another = Country.find_or_create_by(name: 'another')
     arr_jobs.each do |item|
       job_cities = []
-      city_name = []
+      city_names = []
       job_industries = []
-      industry_name = []
+      industry_names = []
       job = Job.new(name: item[:name],
                     salary: item[:salary],
                     description: item[:description],
@@ -25,17 +23,12 @@ class Job < ApplicationRecord
                     updated_date: item[:updated_date])
       # City
       unless item[:city].blank?
-        city_name = item[:city].split(',').map(&:strip)
+        city_names = item[:city].split(',').map(&:strip)
-        job_cities = City.where(name: city_name)
+        job_cities = City.where(name: city_names)
         job_cities.each do |city|
           job.cities << city
         end
-        city_name = city_name - job_cities.pluck(:name)
-        city_name.each do |name|
-          job.cities << City.create(name: name, country: City.list.include?(name.upcase) ? viet_nam : another)
-        end
       end
       # Company
@@ -46,15 +39,15 @@ class Job < ApplicationRecord
       # Industry
       unless item[:industry].blank?
-        industry_name = item[:industry].split(',').map(&:strip)
+        industry_names = item[:industry].split(',').map(&:strip)
-        job_industries = Industry.where(name: industry_name)
+        job_industries = Industry.where(name: industry_names)
         job_industries.each do |industry|
           job.industries << industry
         end
-        industry_name = industry_name - job_industries.pluck(:name)
+        industry_names = industry_names - job_industries.pluck(:name)
-        industry_name.each do |name|
+        industry_names.each do |name|
           job.industries << Industry.create(name: name)
         end
       end
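The name handling in the City and Industry hunks above (split on commas, strip whitespace, then subtract the names that already exist) can be sketched in plain Ruby without ActiveRecord. This is only an illustration under stated assumptions: `split_names`, `partition_names`, and `known_names` are hypothetical names that do not appear in the commit, `known_names` stands in for what `Industry.where(name: ...).pluck(:name)` would return, and the `reject(&:empty?)` guard is an addition that the original code does not have.

```ruby
# Hypothetical helper: turn "A , B,C" into ["A", "B", "C"].
# Mirrors item[:industry].split(',').map(&:strip) from the diff,
# plus an extra guard against nil/blank input and empty fragments.
def split_names(raw)
  return [] if raw.nil? || raw.strip.empty?

  raw.split(',').map(&:strip).reject(&:empty?)
end

# Hypothetical helper: separate names that already exist from names
# that would still need to be created. `known_names` is a stand-in
# for the names returned by a database lookup.
def partition_names(raw, known_names)
  names = split_names(raw)
  existing = names & known_names   # intersection, keeps order of `names`
  missing = names - existing       # same idea as industry_names - pluck(:name)
  { existing: existing, missing: missing }
end

result = partition_names('IT - Software , Banking,Sales', ['Banking', 'Sales'])
puts result.inspect
```

After this commit, only the Industry branch still creates records for the `missing` names; the City branch attaches existing cities and ignores the rest.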
db/seeds.rb
-# This file should contain all the record creation needed to seed the database with its default values.
-# The data can then be loaded with the rails db:seed command (or created alongside the database with db:setup).
-#
-# Examples:
-#
-# movies = Movie.create([{ name: 'Star Wars' }, { name: 'Lord of the Rings' }])
-# Character.create(name: 'Luke', movie: movies.first)
+viet_nam = Country.create(name: 'Viet Nam')
+another = Country.create(name: 'another')
+cities_of_vn = ["Hà Nội", "Hồ Chí Minh", "An Giang", "Bà Rịa - Vũng Tàu", "Bạc Liêu", "Bắc Giang",
+  "Bắc Ninh", "Bến Tre", "Bình Dương", "Bình Định", "Bình Phước", "Bình Thuận", "Cà Mau",
+  "Cao Bằng", "Cần Thơ", "Dak Lak", "Dak Nông", "Đà Nẵng", "Điện Biên",
+  "Đồng Bằng Sông Cửu Long", "Đồng Nai", "Đồng Tháp", "Gia Lai", "Hà Giang", "Hà Nam", "Hà Tây",
+  "Hà Tĩnh", "Hải Dương", "Hải Phòng", "Hậu Giang", "Hòa Bình", "Hưng Yên", "Khác",
+  "Khánh Hòa", "Kiên Giang", "Kon Tum", "KV Bắc Trung Bộ", "KV Đông Nam Bộ", "KV Nam Trung Bộ", "KV Tây Nguyên",
+  "Lai Châu", "Lạng Sơn", "Lào Cai", "Long An", "Nam Định", "Nghệ An", "Ninh Thuận",
+  "Phú Thọ", "Phú Yên", "Quảng Bình", "Quảng Nam", "Quảng Ngãi", "Quảng Ninh", "Quảng Trị",
+  "Sóc Trăng", "Sơn La", "Tây Ninh", "Thái Bình", "Thái Nguyên", "Thanh Hóa", "Thừa Thiên- Huế",
+  "Tiền Giang", "Toàn quốc", "Trà Vinh", "Tuyên Quang", "Vĩnh Long", "Vĩnh Phúc", "Yên Bái"]
+cities_of_another_country = ["Banteay Meanchey", "Battambang", "Kampong Chhnang", "Kampong Speu", "Kampot", "Kandal",
+  "Kâmpóng Thum, Cambodia", "Kep", "Koh Kong", "Kratie", "Mondulkiri", "Otdar Meanchey", "Pailin",
+  "Phnompenh", "Preah Sihanouk", "Preah Vihear", "Prey Veng", "Pursat", "Rotanak Kiri", "Siem Reap",
+  "Sihanoukville", "Stung Treng", "Svay Rieng", "Tbong Khmum", "Kinshasa", "Hồng Kông",
+  "Attapeu", "Bokeo", "Bolikhamsai", "Champasak", "Houaphanh", "Khammouane", "Louang Namtha",
+  "Luang Prabang", "Oudomxay", "Phongsaly", "Sainyabuli", "Salavan", "Savannakhet", "Sekong",
+  "Vientiane", "Xaisomboun", "Xiangkhouang", "Qatar"]
+cities_of_vn.each do |city_name|
+  City.create(name: city_name, country: viet_nam)
+end
+cities_of_another_country.each do |city_name|
+  City.create(name: city_name, country: another)
+end
lib/tasks/data.rake
@@ -2,7 +2,7 @@ require "./app/data/crawler.rb"
 namespace :data do
   task insert_job: :environment do |t|
-    links = Crawler.get_job_link
+    links = Crawler.get_job_links
     links = Job.filter_link_exist(links)
     @data = Crawler.crawl_job_infomation(links)
     Job.create_new_jobs(@data)