GraphLab


Transcript of GraphLab

Page 1: GraphLab

A presentation by Tushar Sudhakar Jee

A Distributed Framework for Machine Learning and Data Mining in the Cloud

Page 2: GraphLab

Bulk Synchronous Parallel (BSP) • A bridging model for designing parallel algorithms (e.g., message relaying).

• Implemented by Google Pregel, 2010. • The model consists of:

1. Concurrent computation: every participating processor may perform local computations.

2. Communication: the processes exchange data between themselves to facilitate remote data storage capabilities.

3. Barrier synchronisation: when a process reaches this point (the barrier), it waits until all other processes have reached the same barrier.
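The three phases above can be sketched with a toy BSP engine, a hedged illustration in Python rather than Pregel's actual API; the `bsp_run` helper, the ring graph, and the max-relaying `compute` function are all invented for the example:

```python
import threading

def bsp_run(initial, neighbors, compute, supersteps):
    """Toy BSP engine: one thread per vertex. Each superstep performs
    local computation, exchanges messages, then waits at a barrier."""
    n = len(initial)
    values = list(initial)
    inbox = [[] for _ in range(n)]
    barrier = threading.Barrier(n)

    def worker(i):
        for _ in range(supersteps):
            msgs, inbox[i] = inbox[i], []         # take delivered messages
            values[i] = compute(values[i], msgs)  # 1. concurrent computation
            barrier.wait()                        # everyone done computing
            for j in neighbors[i]:                # 2. communication
                inbox[j].append(values[i])
            barrier.wait()                        # 3. barrier synchronisation

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return values

# Example: relay the maximum value around a 4-vertex ring.
ring = [[1, 3], [2, 0], [3, 1], [0, 2]]  # undirected ring neighbors
result = bsp_run([3, 1, 4, 2], ring, lambda v, msgs: max([v] + msgs), 3)
```

After three supersteps the maximum has reached every vertex; note that no thread ever reads a message before the barrier guarantees it has been delivered.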

Page 3: GraphLab

Bulk Synchronous Parallel (BSP) • Advantages:

1. No worries about race conditions. 2. The barrier guarantees data consistency. 3. Simpler to make fault tolerant: save data at the barrier.

• Disadvantages: 1. Costly performance penalties, since the runtime of each phase is decided by the slowest machine.

2. Fails to support the asynchronous, graph-parallel and dynamic computation that is critical to the machine learning and data mining (MLDM) community.

Page 4: GraphLab

Asynchronous processing • Implemented by GraphLab, 2010 and 2012. • Advantages:

1. Directly targets the asynchronous, graph-parallel and dynamic computation critical to the MLDM community.

2. Parameters are updated using the most recent values as input, which is closest to sequential execution.

• Disadvantages: 1. Race conditions can happen all the time.

Page 5: GraphLab

Why GraphLab? • Implementing machine learning and data mining algorithms in parallel on current systems like Hadoop, MPI and MapReduce is prohibitively complex and costly.

• GraphLab targets the asynchronous, dynamic, graph-parallel computation in the shared-memory setting needed by the MLDM community.

Page 6: GraphLab

MLDM Algorithm Properties

• Graph structured computation • Asynchronous iterative computation • Dynamic computation • Serializability

Page 7: GraphLab

Graph Structured Computation • Recent advances in MLDM focus on modeling the dependencies between data, since doing so lets us extract more signal from noisy data. For example, modeling the dependencies between similar shoppers allows us to make better product recommendations than treating them in isolation.

• Consequently, there has been a recent shift toward graph-parallel abstractions like Pregel and GraphLab that naturally express computational dependencies.

Page 8: GraphLab

Asynchronous Iterative Computation • Synchronous systems update all parameters simultaneously (in parallel), using parameter values from the previous time step as input. • Asynchronous systems update parameters using the most recent parameter values as input.

Many MLDM algorithms benefit from asynchronous systems.
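The distinction can be made concrete with two PageRank-style sweeps, a hedged Python sketch in which the tiny three-page web graph and the function names are invented for illustration:

```python
def sync_sweep(ranks, in_nbrs, out_deg, d=0.85):
    """Synchronous update: every new value reads ONLY the previous iteration."""
    old = dict(ranks)
    for v in ranks:
        ranks[v] = (1 - d) + d * sum(old[u] / out_deg[u] for u in in_nbrs[v])

def async_sweep(ranks, in_nbrs, out_deg, d=0.85):
    """Asynchronous update: each value reads the MOST RECENT neighbor values,
    including ones already rewritten earlier in this same sweep, which is
    what makes it closest to a sequential execution."""
    for v in ranks:
        ranks[v] = (1 - d) + d * sum(ranks[u] / out_deg[u] for u in in_nbrs[v])

# Tiny 3-page web: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
in_nbrs = {0: [2], 1: [0], 2: [0, 1]}
out_deg = {0: 2, 1: 1, 2: 1}
sync_ranks = {v: 1.0 for v in in_nbrs}
async_ranks = {v: 1.0 for v in in_nbrs}
for _ in range(100):
    sync_sweep(sync_ranks, in_nbrs, out_deg)
    async_sweep(async_ranks, in_nbrs, out_deg)
```

Both disciplines reach the same fixed point; the asynchronous sweep typically gets there in fewer iterations because each update uses fresher information.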

Page 9: GraphLab

Dynamic Computation

• Dynamic computation saves time because the algorithm only recomputes vertices with recently updated neighbors.

• Static computation requires the algorithm to update all vertices equally often, wasting time recomputing vertices that have already converged.
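Dynamic scheduling can be illustrated with single-source shortest paths, a hedged sketch (the graph and helper names are invented) in which a vertex is put back on the schedule only when a neighbor's value actually changed:

```python
from collections import deque

def sssp_dynamic(adj, source):
    """Dynamic computation sketch: a vertex is (re)scheduled only when a
    neighbor's distance genuinely improved, instead of sweeping every
    vertex in every round as a static schedule would."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    queue, queued = deque([source]), {source}
    recomputations = 0
    while queue:
        u = queue.popleft()
        queued.discard(u)
        recomputations += 1
        for v, w in adj[u]:
            if dist[u] + w < dist[v]:    # this neighbor actually changed
                dist[v] = dist[u] + w
                if v not in queued:      # reschedule only affected vertices
                    queue.append(v)
                    queued.add(v)
    return dist, recomputations

# Weighted digraph: edge list per vertex as (target, weight) pairs.
adj = {0: [(1, 1.0), (2, 4.0)], 1: [(2, 1.0), (3, 5.0)], 2: [(3, 1.0)], 3: []}
dist, work = sssp_dynamic(adj, 0)
```

Vertices with no improved neighbor are never touched again, which is exactly the saving the slide describes.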

Page 10: GraphLab

Serializability

• Serializability ensures that every parallel execution has an equivalent sequential execution, thereby eliminating race conditions. • MLDM algorithms converge faster if serializability is ensured. Gibbs sampling, for example, requires serializability for correctness.

Page 11: GraphLab

Distributed GraphLab Abstraction

• Data graph • Update function • Sync operation • GraphLab execution model • Ensuring serializability

Page 12: GraphLab

Data Graph • The GraphLab abstraction stores the program state as a directed graph called the data graph, G = (V, E, D), where D is the user-defined data.

The data here represents model parameters, algorithm state, and statistical data.


Page 13: GraphLab

Data Graph (PageRank example):

Page 14: GraphLab

Update Function • An update function is a stateless procedure that modifies the data within the scope of a vertex and schedules the future execution of update functions on other vertices.

• The function takes a vertex v and its scope S_v, and returns the new versions of the data in the scope as well as a set of vertices T:

Update: f(v, S_v) -> (S_v, T)

Page 15: GraphLab

Update Function (PageRank example):

• The update function for PageRank computes a weighted sum of the current ranks of neighboring vertices and assigns it as the rank of the current vertex.

• The neighbors are scheduled for update only if the value of the current vertex changes by more than a predefined threshold.

Page 16: GraphLab

Update Function (PageRank example):
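The PageRank update just described can be sketched in the f(v, S_v) -> (S_v, T) form. This is illustrative Python standing in for GraphLab's real C++ API; the `Scope` class, the driver loop playing the role of the engine, and the toy graph are all invented:

```python
class Scope:
    """Hypothetical stand-in for a GraphLab vertex scope: read/write access
    to the vertex data, read access to adjacent vertices and edges."""
    def __init__(self, in_nbrs, out_nbrs, ranks):
        self.in_nbrs, self.out_nbrs, self.ranks = in_nbrs, out_nbrs, ranks

def pagerank_update(v, scope, d=0.85, tol=1e-4):
    """Update: f(v, S_v) -> (S_v, T). Writes the new rank into the scope
    and returns the set of vertices T to schedule next."""
    old = scope.ranks[v]
    scope.ranks[v] = (1 - d) + d * sum(
        scope.ranks[u] / len(scope.out_nbrs[u])  # edge weight = 1/out-degree
        for u in scope.in_nbrs[v])
    # Dynamic computation: reschedule neighbors only on a significant change.
    return scope.out_nbrs[v] if abs(scope.ranks[v] - old) > tol else []

# Driver loop playing the role of the GraphLab engine (FIFO scheduling).
in_nbrs = {0: [2], 1: [0], 2: [0, 1]}
out_nbrs = {0: [1, 2], 1: [2], 2: [0]}
scope = Scope(in_nbrs, out_nbrs, {v: 1.0 for v in in_nbrs})
pending = list(in_nbrs)
while pending:
    v = pending.pop(0)
    for w in pagerank_update(v, scope):
        if w not in pending:
            pending.append(w)
```

The scheduler drains once every vertex's rank stops moving past the threshold, at which point the ranks satisfy the PageRank equation to within the tolerance.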

Page 17: GraphLab

Sync Operation • It is an associative, commutative sum defined over all scopes in the graph. • Supports normalization, which is common in MLDM algorithms. • Runs continuously in the background to maintain updated estimates of the global value. • Ensuring serializability of the sync operation is costly and requires synchronization.
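The shape of a sync operation can be sketched as a fold, a hedged illustration (the function names are invented, and GraphLab's real sync runs incrementally in the background rather than in one pass):

```python
def sync(vertex_values, map_fn, add, zero, finalize=lambda acc: acc):
    """Sketch of a sync operation: fold an associative, commutative `add`
    over per-vertex contributions, then `finalize` the global result."""
    acc = zero
    for s in vertex_values:
        acc = add(acc, map_fn(s))
    return finalize(acc)

# Example: total rank, used to normalize PageRank values to sum to 1.
ranks = {"a": 0.5, "b": 1.5, "c": 2.0}
total = sync(ranks.values(), map_fn=lambda r: r,
             add=lambda x, y: x + y, zero=0.0)
normalized = {v: r / total for v, r in ranks.items()}
```

Associativity and commutativity are what let the fold be computed in parallel across machines and combined in any order.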

Page 18: GraphLab

The GraphLab Execution Model • The model allows the GraphLab runtime engine to determine the best order in which to run vertices. • Since many MLDM algorithms benefit from prioritization, the GraphLab abstraction allows users to assign priorities to the vertices in T.

Page 19: GraphLab

Ensuring Serializability • Serializability implies that for every parallel execution, there exists a sequential execution of update functions that would give the same result.

• To ensure serializability, the scopes of concurrently executing update functions must not overlap.

• The greater the consistency, the lower the parallelism.

Page 20: GraphLab

Ensuring Serializability (Full Consistency): • This model ensures that the scopes of concurrently executing update functions do not overlap.

• The update function has complete read/write access to its entire scope. • Concurrently executing update functions must be at least two vertices apart, limiting parallelism.

Page 21: GraphLab

Ensuring Serializability (Edge Consistency): • This model ensures that each update function has exclusive read/write access to its vertex and adjacent edges, but read-only access to adjacent vertices.

• It increases parallelism by allowing update functions with slightly overlapping scopes to run in parallel.

Page 22: GraphLab

Ensuring Serializability (Vertex Consistency):

• This model provides write access to the central vertex data only. • It allows all update functions to run in parallel, providing maximum parallelism.

• It is the least consistent model.

Page 23: GraphLab

Using GraphLab (K-means): • Using GraphLab Create, K-means is run on a dataset from the June 2014 Kaggle competition to classify schizophrenic subjects based on MRI scans.

• The original data consists of two sets of features: functional network connectivity (FNC) features and source-based morphometry (SBM) features, incorporated into a single SFrame with SFrame.join. • The data is downloaded from a public AWS S3 bucket.

Page 24: GraphLab
Page 25: GraphLab
Page 26: GraphLab

Distributed GraphLab Design • Distributed data graph • Distributed GraphLab engines • Fault tolerance • System design

Page 27: GraphLab

Distributed Data Graph • The graph is partitioned into k parts, where k is much greater than the number of machines.

• Each part, called an atom, is stored as a separate file on a distributed storage system (e.g., Amazon S3).

Page 28: GraphLab

Distributed GraphLab Engines: 1. An engine emulates the GraphLab execution model and is responsible for:

• Executing update functions.

• Executing sync operations.

• Maintaining the set of scheduled vertices T.

• Ensuring serializability with respect to the appropriate consistency model. 2. Types:

• Chromatic engine • Distributed locking engine

Page 29: GraphLab

Chromatic Engine:

• It uses vertex coloring to satisfy the edge consistency model, by synchronously executing all vertices of the same color in the vertex set T before proceeding to the next color. • The full consistency model is satisfied by ensuring that no vertex shares the same color as any of its distance-two neighbors. • The vertex consistency model is satisfied by assigning all vertices the same color. • The engine executes the set of scheduled vertices T partially asynchronously.

Edge consistency model using the chromatic engine.

Page 30: GraphLab

Distributed Locking Engine 1. Why use it?

• The chromatic engine does not provide sufficient scheduling flexibility. • The chromatic engine presupposes the availability of a graph coloring, which might not always be readily available.

2. The distributed locking engine enforces mutual exclusion by associating a readers-writer lock with each vertex. 3. Vertex consistency is achieved by acquiring a write lock on the central vertex of each requested scope.

Page 31: GraphLab

4. Full consistency is achieved by acquiring write locks on the central vertex and all adjacent vertices. 5. Edge consistency is achieved by acquiring a write lock on the central vertex and read locks on adjacent vertices.
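The three lock-acquisition rules can be sketched as a small helper, a hedged illustration rather than the engine's real code (the function name and the sorted-order tie-break are invented for the example, though acquiring locks in a canonical order is the standard way to avoid deadlock between overlapping scopes):

```python
def lock_plan(v, neighbors, model):
    """Which per-vertex readers-writer locks each consistency model
    acquires for the scope centred at v. Locks are listed in sorted
    vertex order so that two overlapping scopes, locked concurrently,
    cannot deadlock."""
    if model == "vertex":
        plan = {v: "write"}                     # central vertex only
    elif model == "edge":
        plan = {u: "read" for u in neighbors}   # read adjacent vertices
        plan[v] = "write"                       # write centre + adjacent edges
    elif model == "full":
        plan = {u: "write" for u in neighbors}  # write the entire scope
        plan[v] = "write"
    else:
        raise ValueError(f"unknown consistency model: {model}")
    return sorted(plan.items())

plans = {m: lock_plan(2, [1, 3], m) for m in ("vertex", "edge", "full")}
```

Comparing the three plans for the same scope makes the parallelism trade-off visible: vertex consistency locks one vertex, edge consistency adds read locks, full consistency write-locks everything.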

Page 32: GraphLab

Distributed Locking Engine (Pipelined Locking)

• Each machine maintains a pipeline of vertices for which locks have been requested but not yet fulfilled. • The pipelining system uses callbacks instead of readers-writer locks, since the latter would halt the pipeline.

• Pipelining reduces latency by synchronizing locked data immediately after a machine completes its local lock.

Page 33: GraphLab

Chandy-LamportSnapshotAlgorithm

Page 34: GraphLab

Fault Tolerance • Fault tolerance is introduced into GraphLab using a distributed checkpoint mechanism called Snapshot Update.

• Snapshot Update can be deployed synchronously or asynchronously.

• Asynchronous snapshots are more efficient, and they guarantee a consistent snapshot under the following conditions: a) edge consistency is used on all update functions; b) scheduling completes before the scope is unlocked; c) Snapshot Update is prioritized over other updates.

Page 35: GraphLab

Synchronous snapshots have a “flatline” in throughput, whereas asynchronous snapshots allow computation to proceed.

Page 36: GraphLab

System Design ● In the initialization phase, the atom file representation of the data graph is constructed. ● In the GraphLab execution phase, atom files are assigned to individual execution engines from the DFS.

Page 37: GraphLab

System Design (Locking Engine Design) • The partition of the distributed graph is managed within local graph storage. • A cache is used to provide access to remote data. • The scheduler manages the vertices in T assigned to the process. • Each block makes use of the block below it.

Page 38: GraphLab

Applications

● Netflix movie recommendation ● Video co-segmentation (CoSeg) ● Named entity recognition (NER)

Page 39: GraphLab

Netflix Movie Recommendation ● It makes use of collaborative filtering to predict the movie ratings for each user based on the ratings of similar users.

● The alternating least squares (ALS) algorithm is used to iteratively compute a low-rank matrix factorization.

● The sparse matrix R defines a bipartite graph connecting each user with the movies that they rated.

● Vertices are users (rows U) and movies (columns V), and edges contain the ratings for a user-movie pair.

● The GraphLab update function predicts the ratings (edge values).

Page 40: GraphLab

• ALS alternates between fixing one of the unknowns u_i or v_j. When one is fixed, the other can be computed by solving a least-squares problem. The ALS algorithm is as follows:

Page 41: GraphLab

• R = {r_ij} is the n_u × n_v user-movie matrix, where each entry r_ij represents the rating score of item j by user i, with r_ij = <u_i, v_j> for all i, j.

• U represents the user feature matrix and V represents the movie feature matrix. • The dimension of the feature space, d, is a system parameter determined using a hold-out dataset or cross-validation. • The low-rank approximation problem is thus formulated as follows, to learn the factor vectors (u_i, v_j):

minimize over U, V: Σ_{(i,j) ∈ K} (r_ij − p_ij)² + λ (Σ_i ||u_i||² + Σ_j ||v_j||²),

where p_ij = <u_i, v_j> is the predicted rating, λ is the regularization coefficient, and K is the set of known ratings from the sparse matrix R.
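The alternating minimization can be sketched in the special case d = 1, where each least-squares subproblem has a closed-form scalar solution. This is a hedged toy, not the Netflix-scale implementation: the paper runs ALS with much larger d, and the helper names and rank-1 ratings below are invented for illustration:

```python
def als_rank1(ratings, n_users, n_movies, lam=1e-3, iters=50):
    """ALS sketch for d = 1. With V fixed, the regularized least-squares
    solution for each user factor is the scalar
        u_i = sum_j v_j * r_ij / (lam + sum_j v_j**2)
    over the movies j that user i rated, and symmetrically for each v_j."""
    u = [1.0] * n_users
    v = [1.0] * n_movies
    by_user = [[(j, r) for i2, j, r in ratings if i2 == i] for i in range(n_users)]
    by_movie = [[(i, r) for i, j2, r in ratings if j2 == j] for j in range(n_movies)]
    for _ in range(iters):
        for i in range(n_users):        # fix V, solve for each u_i
            num = sum(v[j] * r for j, r in by_user[i])
            den = lam + sum(v[j] ** 2 for j, _ in by_user[i])
            u[i] = num / den
        for j in range(n_movies):       # fix U, solve for each v_j
            num = sum(u[i] * r for i, r in by_movie[j])
            den = lam + sum(u[i] ** 2 for i, _ in by_movie[j])
            v[j] = num / den
    return u, v

# Toy ratings sampled from an exactly rank-1 matrix: r_ij = u*_i * v*_j.
truth_u, truth_v = [1.0, 2.0], [1.0, 2.0, 3.0]
ratings = [(i, j, truth_u[i] * truth_v[j]) for i in range(2) for j in range(3)]
u, v = als_rank1(ratings, 2, 3)
```

On rank-1 data the recovered predictions u_i * v_j closely match the observed ratings, up to a small shrinkage introduced by the regularizer λ.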

Page 42: GraphLab

Netflix Scaling with Intensity • Plotted is the speedup achieved for varying values of the dimensionality d. • Extrapolating to obtain the theoretically optimal runtime, the estimated overhead of Distributed GraphLab at 64 machines is 12x for d = 5 and 4.9x for d = 100.

Page 43: GraphLab

Netflix Comparisons • The GraphLab implementation was compared against Hadoop and MPI using between 4 and 64 machines.

• GraphLab performs between 40 and 60 times faster than Hadoop. • It also slightly outperformed the optimized MPI implementation.

Page 44: GraphLab

Video Co-segmentation (CoSeg) • Video co-segmentation automatically identifies and clusters spatio-temporal segments of video that share similar texture and color characteristics.

• Frames of high-resolution video are pre-processed by coarsening each frame to a regular grid of rectangular super-pixels.

• The CoSeg algorithm predicts the best label (e.g., sky, building, grass) for each super-pixel using a Gaussian mixture model (GMM) in conjunction with loopy belief propagation (LBP).

Page 45: GraphLab

• The two algorithms are combined to form an expectation-maximization problem, alternating between LBP to compute the label for each super-pixel given the GMM, and then updating the GMM given the labels from LBP.

• The GraphLab update function executes the LBP local iterative update, where updates expected to change values significantly are prioritized. The BP update function is as follows:

Page 46: GraphLab
Page 47: GraphLab

• The locking engine provides nearly optimal weak scaling: the runtime does not increase significantly as the size of the graph increases proportionately with the number of machines.

• It was also observed that increasing the pipeline length increased performance significantly and compensated for poor partitioning.

Page 48: GraphLab

Named Entity Recognition (NER) • Named entity recognition is the task of determining the type (e.g., Person, Place, or Thing) of a noun phrase (e.g., Obama, Chicago, or Car) from its context (e.g., “President …”, “Lives near …”, or “bought a …”). • The data graph for NER is bipartite, with one set of vertices corresponding to noun phrases and the other to contexts. • The CoEM vertex program updates the estimated distribution for a vertex (either a noun phrase or a context) based on the current distributions of its neighboring vertices.

Page 49: GraphLab

• Below is the CoEM algorithm, in which adjacent vertices are rescheduled if the type distributions at a vertex change by more than some predefined threshold.
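A CoEM-style vertex program can be sketched on a toy bipartite graph. This is a hedged, simplified illustration: the uniform neighbor weighting, the seed noun phrases, and every name below are invented for the example, not taken from the paper's implementation:

```python
def coem_update(v, dists, nbrs, tol=1e-3):
    """CoEM-style vertex program sketch: the label distribution at a vertex
    becomes the average of its neighbors' distributions; the neighbors are
    returned for rescheduling only if the distribution moved by more than
    `tol` in L1 distance (dynamic computation)."""
    labels = dists[v].keys()
    new = {c: sum(dists[u][c] for u in nbrs[v]) / len(nbrs[v]) for c in labels}
    change = sum(abs(new[c] - dists[v][c]) for c in labels)
    dists[v] = new
    return nbrs[v] if change > tol else []

# Toy bipartite graph: noun phrases <-> contexts, with two labeled seeds.
nbrs = {"obama": ["president_X"], "chicago": ["lives_in_X"],
        "springfield": ["lives_in_X"],
        "president_X": ["obama"], "lives_in_X": ["chicago", "springfield"]}
dists = {v: {"person": 0.5, "place": 0.5} for v in nbrs}
dists["obama"] = {"person": 1.0, "place": 0.0}    # labeled seed
dists["chicago"] = {"person": 0.0, "place": 1.0}  # labeled seed
seeds = {"obama", "chicago"}

pending = [v for v in nbrs if v not in seeds]
while pending:
    v = pending.pop(0)
    for w in coem_update(v, dists, nbrs):
        if w not in pending and w not in seeds:
            pending.append(w)
```

The unlabeled noun phrase "springfield" shares a context with the seed "chicago", so its distribution drifts toward Place as the updates propagate across the bipartite graph.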

Page 50: GraphLab

NER Comparisons • The GraphLab implementation of NER achieved a 20-30x speedup over Hadoop and was comparable to the optimized MPI implementation. • GraphLab scaled poorly, achieving only a 3x improvement using 16x more machines, due mainly to the large vertex data size, dense connectivity, and poor partitioning.

Page 51: GraphLab

Comparison(Netflix/CoSeg/NER)

Page 52: GraphLab

Comparison(Netflix/CoSeg/NER)

• Overall network utilization: Netflix and CoSeg have very low bandwidth requirements, while NER appears to saturate when the number of machines exceeds 24. • Snapshot overhead: the overhead of performing a complete snapshot of the graph every |V| updates is highest for CoSeg when running on a 64-machine cluster.

Page 53: GraphLab

EC2 Cost Evaluation

• The price-runtime curves (log-log scale) for GraphLab and Hadoop illustrate the cost of deploying either system.

• For the Netflix application, GraphLab is about two orders of magnitude more cost-effective than Hadoop.

Page 54: GraphLab

Conclusion • The paper covered:

• The requirements of MLDM algorithms. • GraphLab was extended to the distributed setting by:

• Relaxing the schedule requirements • Introducing a new distributed data graph • Introducing new execution engines • Introducing fault tolerance.

• Distributed GraphLab outperforms Hadoop by 20-60x and is competitive with tailored MPI implementations.

Page 55: GraphLab

Future Work • Extending the abstraction and runtime to support dynamically evolving graphs and external storage in graph databases. • Further research into the theory and application of dynamic asynchronous graph-parallel computation, helping to define the emerging field of big learning.

Page 56: GraphLab

References

• Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. GraphLab: A New Parallel Framework for Machine Learning. In UAI, pages 340-349, 2010.

• Y. Low, J. Gonzalez, A. Kyrola, D. Bickson, C. Guestrin, and J. M. Hellerstein. Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud.

• Joseph Gonzalez. Parallel and Distributed Systems for Probabilistic Reasoning. PhD thesis, CMU-ML-12-111, December 21, 2012.

• GraphLab slides by Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, and Carlos Guestrin.

• Y. Zhou, D. Wilkinson, R. Schreiber, and R. Pan. Large-Scale Parallel Collaborative Filtering for the Netflix Prize. In AAIM, pages 337-348, 2008.

• Christopher R. Aberger. Recommender: An Analysis of Collaborative Filtering Techniques.

• GraphLab Create: http://graphlab.org.

• CS-425 Distributed Systems slides: Chandy-Lamport Snapshot Algorithm and Multicast Communication, by Klara Nahrstedt.

• CSE 547 slides: Graph-Parallel Problems, Synchronous vs. Asynchronous Computation, by Emily Fox.

Page 57: GraphLab